Managing and building ETL pipelines for various analytical use cases across the foundation, deriving metrics for the community, and helping assess the health of the foundation. Monitoring system health as part of Ops-Week and contributing to data governance and privacy initiatives. Modernizing the platform for streaming and batch processes. There is always something to improve, and there are always new use cases to build. Fun!
Worked on a highly durable, available, and scalable distributed write-ahead log system for AWS. Was responsible for end-to-end service development, improvement, deployment, and on-call rotations. Launched features to automatically manage high-volume customers, which included writing design docs and developing durable, low-latency APIs while ensuring a smooth customer experience.
Worked with Professor Daniel G. Brown on analyzing the ability of LLMs to respond appropriately and consistently to sensitive topics with prompt variations. We analyzed over 30 models, both open and closed source, and compared their performance in areas like task understanding and response consistency. Surprisingly, our findings indicate that most models, even some large open-source models, have a difficult time answering simple Yes/No questions and can barely understand the task at hand. They are prone to changing their responses with slight variations in prompt wording and give different responses in different settings. These findings warn us against using LLMs for Q&A without proper planning and testing, given their limited instruction-following and task-understanding capacity.
I worked with the Research Team as a Research Data Scientist (NLP) to develop Copyediting as a structured task. Meta page, Report, Code.
As a contract data analyst, I worked with the Search and Analytics team to analyze SPARQL queries along with the Wikidata dump to help scale the Wikidata Query Service. The analysis was done using both Spark (Scala) and PySpark (Python) on Hadoop clusters. Analysis work: wikitech/User:AKhatun.
Selected as an Outreachy intern to work with the Abstract Wikipedia project under the Wikimedia Foundation.
Performed data analysis and applied machine learning algorithms on computer vision and time-series data for pattern recognition and prediction generation.
Worked on developing larger datasets and implementing transfer-learning-based deep learning approaches for Authorship Attribution in Bengali Literature, far surpassing the existing systems. Work available on GitHub. Datasets available on Mendeley.
Completed Master's thesis research on the ability of LLMs to respond appropriately and consistently to sensitive topics with prompt variations.
Advisor: Daniel G. Brown
(Thesis Report)
Courses taken:
CS848 F22: The Art and Science of Empirical Computer Science
CS848 F22: Knowledge Graphs
CS889 W23: InfoVis for AI Explainability
CS889 S23: Value-Driven Technology
Completed undergraduate thesis on Authorship Attribution in Bangla Literature.
Applied deep learning NLP techniques to achieve high-performing, scalable systems.
Advisors: Md Saiful Islam, Ayesha Tasnim
(Thesis Report)
Core Courses: Algorithm Design and Analysis, Data Structure, Database System, Object Oriented Programming, Software Engineering and Design Patterns, Technical Writing and Presentation, Artificial Intelligence, Introduction to Data Science, Machine Learning
Received the Barbara Hayes-Roth Award for Women in Math and Computer Science for demonstrated academic excellence as a graduate student at the University of Waterloo.
Graduate student award of excellence from the University of Waterloo.
Selected as one of 54 Outreachy interns among 1000+ applicants through contributions to various open-source projects.
Scholarship for a Udacity Nanodegree from Facebook and Udacity. Selected as one of 300 candidates out of 6000 competing scholars from the challenge course (Secure and Private AI). Pursued the Computer Vision Nanodegree through this scholarship.
2nd Place, Best Research Poster Award, ICBSLP (International Conference). Presented our work on Authorship Attribution in Bengali Literature using transfer learning and compared it to existing systems and character-level CNN architectures.
Four-year Education Board Scholarship during undergraduate studies for best performance nationwide, awarded by the Education Board, Government of Bangladesh.
Leading the Research team at Tech+ UWaterloo, identifying DEI in Tech trends and analyzing the impact of Tech+ programs and events.
Mentored a group of undergraduate students through the Directed Reading Program (DRP) where undergraduate students get introduced to new topics in Math and Computer Science and possibly some gentle introduction to research. My group learned about LLMs, ways to set up and use a personal LLM, and its applications in creative endeavors like story-telling.
Conducted a hands-on beginner's AI workshop at the University of Waterloo for the WiCS (Women in CS) Conference 2023. Attendees included undergraduate and high school students. They learned about AI and were given a rundown of a simple ML problem using the Titanic Kaggle Competition.
Panelist at the SPARCS workshop, Centre for Education in Mathematics and Computing. Discussed my research and various opportunities in CS with underrepresented students in Grades 9-10 across Canada. The aim was to bust the myths of Computer Science and invite inclusivity in the field: to show students that Computer Science is welcoming to all, whether they are math-savvy or not, tech-savvy or not, and what recent career prospects look like for Computer Science students. Included a short presentation followed by a Q&A session.
The Catalyst Conference is an exciting opportunity for grade 11 girls and non-binary students to participate in hands-on workshops, explore the University of Waterloo, meet engineering students, compete in a design competition, and experience life in residence.
Volunteered to guide students during workshops and other activities. Engaged in conversations about Women in Engineering and life at UWaterloo.
More about Catalyst and Catalyst Conference.
Helped set up multiple Women in CS Technovation events at UWaterloo, and guided and mentored students during them.
Technovation Girls Waterloo.
Organized events and helped applicants, mentors, and interns in all steps pertaining to the internships.
Panelist for 2 sessions and a project presenter at Wikidata Conference (WikidataCon). Sessions, Video.
Since my Outreachy internship with Abstract Wikipedia, I have been improving and developing the abstract-wiki-ds tool to better perform clustering on source code.
Phabricator: T263678
Conducted a series of IEEE beginner Machine Learning workshops. Workshop materials available on GitHub.
Trained junior-year undergraduate students in competitive programming.
This nanodegree was the 2nd phase of the Secure and Private AI Facebook Udacity Scholarship. Learned and applied Image Processing, Transfer Learning, Kalman filters, and the Graph SLAM algorithm. Completed projects include Day-Night Detection, Facial Keypoint Detection, Object Detection, Image Captioning, Sentiment Analysis, and Object Localization and Mapping.
Certificate
A bundle of DataCamp courses for Python, data cleaning, manipulation and analysis, pandas, data visualization, SQL, statistical thinking, and machine learning.
Certificate
Jeremy Howard
Machine learning course and Deep learning specialization by Andrew Ng, Coursera. These courses cover everything from the basics of machine learning and neural networks from scratch to deep learning techniques in computer vision and NLP with CNN, RNN, GRU and LSTMs, hyperparameter tuning and structuring machine learning projects.
Machine Learning Certificate
DIY (Do It Yourself) Track.
This summer of code was my first-ever introduction to Python and Machine Learning.
Tons of amazing volunteer mentors helped me a lot, from setting up Python to understanding support vector machines and random forests.
Setting up Python and Anaconda on Windows(!) was quite messy: multiple Python installations and too many incompatibility issues. I have come a long way since then.
Did a lot of assignments and solved ML problems in the completely hands-on workshops of this SoC.