Hacker News | ShaShekhar's comments

Let's start an open-source, non-profit AI government. Create LLM agents to represent every politician, decision-maker, and policymaker. By simulating their decision-making processes, we can measure their performance against actual human decisions. This would involve considering both short-term and long-term consequences. Running this experiment for 5-10 years would generate valuable comparison data, allowing the public to interact with and contribute to the development of the project.

A week ago, Balaji announced a challenge to create an open-source demo of a President Biden deepfake that would outperform the teleprompter-reading Biden. This isn't new; we've seen such videos before. However, there hasn't been a project that combines all the open-source models and lets us talk to and interact with the deepfake.

This is the initial release. Responses will stream live. To make this real-time on consumer GPUs and indistinguishable, a lot of work needs to be done, such as aligning the LLM, training the text-to-speech model, and optimizing the lip-sync video generation model.
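To make the latency point concrete, here is a minimal sketch of such a streaming pipeline, assuming hypothetical stage names (the real project wires an actual LLM, TTS model, and lip-sync model in place of the stand-ins):

```python
def stream_llm_tokens(prompt):
    """Stand-in for a streaming LLM; yields tokens one at a time."""
    for token in f"Answering: {prompt}".split():
        yield token

def chunk_into_sentences(tokens, max_words=8):
    """Group streamed tokens into small chunks so TTS can start
    speaking before the LLM has finished generating."""
    buf = []
    for tok in tokens:
        buf.append(tok)
        if tok.endswith((".", "?", "!")) or len(buf) >= max_words:
            yield " ".join(buf)
            buf = []
    if buf:
        yield " ".join(buf)

def synthesize(chunk):
    """Stand-in for text-to-speech; returns a fake audio buffer."""
    return ("audio", chunk)

def lip_sync(audio):
    """Stand-in for the lip-sync video generator; returns fake frames."""
    return ("frames", audio[1])

def run_pipeline(prompt):
    """Stream: LLM tokens -> sentence chunks -> audio -> lip-synced video."""
    for chunk in chunk_into_sentences(stream_llm_tokens(prompt)):
        yield lip_sync(synthesize(chunk))
```

The key design choice is chunking at sentence boundaries: each chunk flows through TTS and lip-sync independently, so perceived latency is one chunk rather than the whole response.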

Please check out this project on GitHub.


Thanks. Stable Diffusion inpainting v1.5. I'd played around with this model so much that I ended up using it. I've read both papers: SDEdit, where you need a mask for inpainting, and InstructPix2Pix, where you don't. I know I'm a year behind when it comes to using newer models like LEDITS++, LCM, SDXL inpainting, etc. There is so much work to do. VCs won't fund me as it's not a B2B spinoff.


InstructPix2Pix is fine-tuned on SD v1.5, an inpainting model (aware of context and semantics), which is why it doesn't require a mask.


Right now, the inpainting is done on a semantic mask (the output of a segmentation model). For more complex instructions, we also have to support contextual mask generation, which is an active area of research in the field of Visual Language Models. When it comes to performing several iterations, you can also do that at the semantic level, or get a batch of outputs. The SD v1.5 inpainting model is quite weak, and we haven't seen a large-scale open-source inpainting model for a while.
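A minimal sketch of the semantic-mask step, assuming a hypothetical helper name: turning a segmentation model's label map into the binary mask an inpainting model expects, slightly dilated so the model blends edges (NumPy only, no scipy dependency; `np.roll` wraps at image borders, which is fine for a sketch):

```python
import numpy as np

def semantic_mask(seg_map, target_ids, pad=1):
    """Build a binary inpainting mask from a segmentation label map.

    seg_map: (H, W) integer array of class ids from a segmentation model.
    target_ids: class ids of the regions to replace (e.g. the 'sky' label).
    pad: grow the mask by this many pixels so inpainting blends the edges.
    Returns a uint8 mask (0 or 255) in the format inpainting pipelines expect.
    """
    mask = np.isin(seg_map, list(target_ids))
    if pad:
        # Simple dilation via shifted boolean ORs.
        grown = mask.copy()
        for dy in range(-pad, pad + 1):
            for dx in range(-pad, pad + 1):
                grown |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
        mask = grown
    return (mask * 255).astype(np.uint8)
```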


The tools are there; we just have to connect them (check out the TODO section). More complex instructions, like those where you want to create the mask yourself, require a lot of contextual reasoning, which I tried to point out in the Research section.


I integrated and tested Microsoft's Phi-3-mini, and it works really well. Having the freedom to run locally without sharing private photos is my utmost objective.


Example instructions: 1. Replace the sky with a deep blue sky, then replace the mountain with a Himalayan mountain covered in snow. 2. Stylize the car with a cyberpunk aesthetic, then change the background to a neon-lit cityscape at night. 3. Replace the person with a sculpture complementing the architecture.
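Compound instructions like these can be handled by splitting on "then" boundaries so each step maps to one mask-and-inpaint pass. A minimal sketch, assuming a hypothetical helper name:

```python
import re

def split_instruction(instruction):
    """Split a compound edit instruction into sequential steps on
    ', then' / 'then' / ';' boundaries, so each step can run as its
    own mask-and-inpaint pass."""
    parts = re.split(r"\s*(?:,\s*then\b|\bthen\b|;)\s*",
                     instruction, flags=re.IGNORECASE)
    return [p.strip().rstrip(".") for p in parts if p.strip()]
```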

Check out the Research section for more complex instructions.


I did it intentionally. The video had my voice, but then I decided to replace it with an AI voice.

