Ben Birnbaum (@benbirnbaum)'s Twitter Profile
Ben Birnbaum

@benbirnbaum

Machine learning engineer. Former lead of the Machine Learning Team at @flatironhealth, now diving into drug discovery and computational chemistry.

ID:21038102

Link: http://bbirnbaum.com
Joined: 16-02-2009 22:41:53

316 Tweets

461 Followers

907 Following

Joe Mou (@yoyoyojomo)

I'm hacking on Knit, a new way to collaborate with data dependencies!

Tie together SQL, Python, and other data processes. Share works in progress for data/code review. Cache across multiple users for faster runs.

Looking for early users and contributors!
purposefulserendipity.com/blog/talk-abou…

Ben Birnbaum (@benbirnbaum)

Overall, this is a really clever idea, and one that feels very customizable and extensible. Here are links to the full paper (again) and the GitHub repo:
arxiv.org/abs/2110.06389
github.com/wenhao-gao/Syn…

Ben Birnbaum (@benbirnbaum)

The results are competitive with the state of the art and have the very important distinction of always being for molecules that can be synthesized.

Ben Birnbaum (@benbirnbaum)

Each analog is represented by its embedding, and new embeddings are created via mating and mutating the embeddings of the best molecules generated. These new embeddings can then be used to guide the synthesis of new molecules, and the whole process repeats until convergence.
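
The genetic-algorithm loop described in this thread (score the candidates, keep the best, mate and mutate their embeddings, repeat) could be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the names `mate`, `mutate`, and `evolve` are hypothetical, the crossover and mutation rules are assumed, the loop runs for a fixed number of generations instead of testing for convergence, and the score is computed directly on the embedding. In the method described here, each new embedding would first guide the synthesis of a molecule, and that molecule would be scored.

```python
import random


def mate(parent_a, parent_b, rng):
    """Uniform crossover (an assumed rule): copy each coordinate from one parent."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(embedding, rng, rate=0.1, scale=0.1):
    """Add Gaussian noise to a random subset of coordinates (an assumed rule)."""
    return [x + rng.gauss(0.0, scale) if rng.random() < rate else x
            for x in embedding]


def evolve(population, score, n_generations=50, n_keep=10, seed=0):
    """Toy genetic algorithm over embedding vectors; higher score is better.

    `score` stands in for "synthesize a molecule from the embedding, then
    evaluate it" (docking, property prediction, and so on).
    """
    rng = random.Random(seed)
    pop_size = len(population)
    for _ in range(n_generations):
        # Keep the best-scoring embeddings unchanged (elitism).
        elite = sorted(population, key=score, reverse=True)[:n_keep]
        # Refill the population with mutated offspring of the elite.
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(mate(parent_a, parent_b, rng), rng))
        population = elite + children
    return max(population, key=score)
```

Because the elite are carried over unchanged, the best score in the population never decreases from one generation to the next.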

Ben Birnbaum (@benbirnbaum)

The final step is to layer in an optimization algorithm like a genetic algorithm. The procedure above is used to generate a bunch of analogs and then score them according to whatever metrics are of interest (e.g. docking, ML property prediction, MPO, etc.).

Ben Birnbaum (@benbirnbaum)

But this failure is actually a feature, not a bug. The compounds that are generated instead will tend to be analogous, since they are close in embedding space, and they will also be synthesizable, since only the supplied building blocks and reaction templates were used.

Ben Birnbaum (@benbirnbaum)

Once the model is trained, it can be run on a set of target compounds to find synthetic plans for those compounds. Sometimes the model will succeed, and sometimes it will fail.

Ben Birnbaum (@benbirnbaum)

So, with a representation of what has been synthesized so far, as well as a representation of where the synthesis should go, the model has, at least in theory, what it needs to predict the next reaction step.

Brett Berson (@brettberson)

I’m thrilled to finally share a new product we’ve been heads-down building, now available in public beta: First Round Angel Directory.

It’s the most extensive resource for founders & future founders to discover the best active angel investors in tech.

angels.firstround.com/directory

Apoorva is in SF (@apoorvasriniva)

we're having our first Bits in Bio meetup in New York on 02/15. if you're interested in software x biotech, you should definitely stop by!

🧬where: The Brooklyneer
🧬when: 6:30 ET

see link below for more details and DM me if you have any questions.
eventbrite.com/e/bits-in-bio-…

Therapeutics Data Commons (@ProjectTDC)

hERG coordinates the heart's beating. If a drug blocks hERG, severe adverse effects occur. Reliable prediction of hERG is thus crucial in early drug design.

TDC now includes hERG Central, with >300K data points, contributed by Ben Birnbaum!

tdcommons.ai/single_pred_ta…

Ben Birnbaum (@benbirnbaum)

At this talk now, listening to Kexin Huang describe how to contribute data sets. I can speak from experience -- it's really easy and the TDC team is very helpful and responsive. So happy this amazing resource exists. Thanks to the team!
