In TDS Archive by 💡Mike Shakhomirov · Advanced SQL for Data Science · Expert techniques to elevate your analysis · Aug 24, 2024
In Data Engineer Things by Vu Trinh · I spent 8 hours learning Parquet. Here’s what I discovered · I finally sat down and learned about it. · Aug 24, 2024
Patrick Cuba · 1. Data Vault and Domain Driven Design · “It is not the domain experts’ knowledge that goes to production, it is the assumption of the developers.” — Alberto Brandolini · Sep 5, 2022
Patrick Cuba · Data Vault on Snowflake: Performance Tuning with Keys · Snowflake continues to set the standard for Data in the Cloud by taking away the need to perform maintenance tasks on your data platform… · Jul 19, 2023
Shawn Tng · Automated Scoring For Tiktok Dance Challenge · This is a continuation of my previous post, where Google’s Mediapipe was used for multi-person pose estimation. From each video frame, we… · May 20, 2021
In Bootcamp by Josh Cottrell-Schloemer · Excel is your most overlooked design tool · A designer’s perspective on the world’s #1 spreadsheet tool — how to build infographics, dashboards, presentations & more · Apr 19, 2022
Saumil Mehta · Why You’re Paid What You’re Paid — Five Key Tech Compensation Takeaways · What do you make every year? What kinds of raises have you gotten over the last few years? How are you valuing cash vs. equity? How much… · Apr 2, 2022
In TDS Archive by Khuyen Tran · Great Expectations: Always Know What to Expect From Your Data · Ensure Your Data Works as Expected Using Python · Oct 8, 2021
In TDS Archive by Josh Taylor · Fuzzy matching at scale · From 3.7 hours to 0.2 seconds. How to perform intelligent string matching in a way that can scale to even the biggest data sets. · Jul 1, 2019
In TDS Archive by Naim Kabir · Jinja + SQL = ❤️ · Macros for maintainable, testable data analytics · Aug 15, 2021
Kosma Fuławka · What I’ve learned setting up 12 Databricks environments · Data Engineering in practice. Preparing an enterprise grade environment in a huge organization is not a piece of cake. In fact, it is quite… · Nov 19, 2021
In Bluecore Engineering by Jessica Laughlin · We’re All Using Airflow Wrong and How to Fix It · Tl;dr: only use Kubernetes Operators · Aug 3, 2018
Ruurtjan Pul · Understanding Kafka with Factorio · While playing Factorio the other day, I was struck by the many similarities with Apache Kafka. · Apr 27, 2019
In The Airbnb Tech Blog by Vaughn Quoss · Data Quality at Airbnb · Part 2 — A New Gold Standard · Nov 24, 2020
Patrick Bacon · Why do Aggressive Defensemen Experience Sharp Declines at Young Ages? · Alex Pietrangelo’s rough start is an excellent case study of a broader concept. · Nov 4, 2021
David B. · Old: Introducing a v2 - version of org-chart lib · Checkout v3 org chart introduction here · Aug 14, 2021
In TDS Archive by Sara A. Metwalli · 5 Games That Can Help You Improve Your Skills As a Data Scientist · Improve your skills and have fun at the same time. · Sep 19, 2021
In TDS Archive by Paul Singman · 3 Data Lake Anti-Patterns to Avoid · Rid yourself of these troubling habits and start the journey towards data lake mastery! · Mar 30, 2021
In FT Product & Technology by Mihail Petkov · Financial Times Data Platform: From zero to hero · An in-depth walkthrough of the evolution of our Data Platform · Dec 2, 2020
In CodeX by RudderStack · The Future of Data Pipeline Tools Must Include Better Transformations Than ETL Ever Had · Everybody hates data transformations in their pipeline tools. Developers hated the transformations in ETL tools so much that they came up… · Jul 12, 2021