PyData Triangle March 2021 Online Meetup

PyData Triangle March 2021 Meetup Zoom Meeting

0:00:00 Intro

0:00:48 Presenter: Rachael Tatman

Title: Rules + Deep Learning: Why you need both to build Conversational AI that actually works

Presentation Overview:
Current NLP research is focused on large, neural models and these models have seen a lot of success across many different applications. But to build a conversational AI system that works well in practice, there's no getting around it: you need some rules as well. This talk will put both rules and transformer models into their historical context in NLP and discuss best practices and examples for combining them in hybrid systems.

Bio:
Rachael is a developer advocate for Rasa, where she's helping developers build and deploy conversational AI applications using their open source framework.

Rachael has a PhD in Linguistics from the University of Washington. Her research was on computational sociolinguistics, or how our social identity affects the way we use language in computational contexts. Previously she was a data scientist at Kaggle and is still a Grandmaster.

0:42:05 Presenter: Alex Lew

Title: Probabilistic Scripting for Common-Sense Data Cleaning at Scale

Presentation Overview:
Real-world data is often messy and incomplete, littered with typos, duplicates, NULL values, and other errors or inconsistencies. Although cleaning dirty data is important for many workflows, it has proven difficult to automate: cleaning often requires common-sense reasoning and judgment calls about objects in the world.

In this talk, I’ll introduce a new declarative-programming approach to automating common-sense data cleaning, based on recent advances in probabilistic programming. Our system, PClean, lets users express their uncertain knowledge about their datasets in declarative scripts, and compiles efficient cleaning algorithms guided by those scripts. We’ll look at the probabilistic programming ideas that make PClean tick, and show how short scripts (fewer than 50 lines) can achieve state-of-the-art accuracy and performance on several cleaning tasks, scaling to millions of rows.

Bio:
Alex Lew is a Ph.D. student at MIT's Probabilistic Computing Project, and a lead researcher for Metaprob, an open-source probabilistic programming language embedded in Clojure(Script). He aims to build tools that empower everyone to use probabilistic modeling and inference to solve problems creatively. Before coming to MIT, Alex designed and taught a four-year high-school computer science curriculum at the Commonwealth School in Boston. Before that, he was a student at Yale, where he received a B.S. in computer science and mathematics in 2015. A native of Durham, NC, he also returns home each summer to teach at the Duke Machine Learning Summer School (and spend time with his family and their dogs!).

===
www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps

