Yes, federated queries (external tables) are supported, but they are a lot slower than ingesting the data into Snowflake's storage and querying it there. Since Snowflake's pricing model is based on compute time, querying external tables is usually more costly because of the worse performance.
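For concreteness, here is a rough sketch of the two approaches using the Python connector; the connection parameters and the external stage @EVENTS_STAGE are hypothetical, and the SQL is trimmed to the essentials:

```python
# Sketch: querying files in place vs. ingesting them into native storage.
# Connection details and the @EVENTS_STAGE external stage are assumptions.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="MY_WH", database="MY_DB", schema="PUBLIC",
)
cur = conn.cursor()

# Option 1: external table over the files (slower per query, so more
# compute time is billed for the same work).
cur.execute("""
    CREATE OR REPLACE EXTERNAL TABLE ext_events
    LOCATION = @EVENTS_STAGE
    FILE_FORMAT = (TYPE = PARQUET)
""")
cur.execute("SELECT COUNT(*) FROM ext_events")

# Option 2: ingest once into a native table, then query that.
cur.execute("CREATE OR REPLACE TABLE events (payload VARIANT)")
cur.execute("""
    COPY INTO events
    FROM @EVENTS_STAGE
    FILE_FORMAT = (TYPE = PARQUET)
""")
cur.execute("SELECT COUNT(*) FROM events")
```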
I'm not surprised. LLMs are not good at those problems. There is a lot of hype; they are not good at every problem, but they are quite good at some of them. If you have a classical NLP task, they do well, particularly GPT-4. If you have a generative problem where you don't care about mistakes, such as a grocery list or a rough draft of marketing copy, they are also good.
LLMs are not good at search or math, and they don't work as encyclopedias or logic engines. Maybe some day they will be, but not yet.
Similar experience for me. I have found that many of the aches and pains I get as I age are resolved by exercise and stretching. For example, I hurt my lower back in my 20s. Not extremely badly, but badly enough that, 20 years later, it still gets sore if my back gets too weak. I stretch and strengthen it without using weights, and it eventually gets better. I also stopped running for a decade and then decided to pick it back up. My knees were mildly sore at first, but eventually they got strong enough that the pain went away. Running is also great for strengthening the lower back.
Trial and error. I recommend looking up non-weighted exercises, stretches, and activities that target the body parts you are interested in, and then start trying them.
CockroachDB and Yugabyte are Postgres-compatible. GCP Spanner also has a Postgres compatibility layer. CockroachDB and Spanner support foreign keys across shards, which is an interesting feature.
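As a small illustration of the compatibility point, a plain Postgres driver works against CockroachDB, and an ordinary foreign key is enforced even when the referenced rows live on different ranges/nodes. This is just a sketch, assuming a local insecure single-node cluster started with `cockroach start-single-node --insecure`:

```python
# CockroachDB speaks the Postgres wire protocol, so psycopg2 connects as-is.
import psycopg2

conn = psycopg2.connect("postgresql://root@localhost:26257/defaultdb?sslmode=disable")
conn.autocommit = True
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS customers (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name STRING NOT NULL
    )
""")
cur.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        customer_id UUID NOT NULL REFERENCES customers (id),
        total DECIMAL NOT NULL
    )
""")

# The FK is enforced across shards: inserting an order that references a
# missing customer fails, regardless of where the rows are stored.
cur.execute("INSERT INTO customers (name) VALUES ('Alice') RETURNING id")
customer_id = cur.fetchone()[0]
cur.execute("INSERT INTO orders (customer_id, total) VALUES (%s, %s)", (customer_id, 9.99))
```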
Google engineers invented most of the primary NLP innovations of the last decade that led to the current stack OpenAI is built on, such as word2vec, which started it all, and transformers. Their best public models certainly lag OpenAI's, but their work isn't vaporware.
Drug trials are already double-blind and highly regulated, particularly phase 3 trials, so further controlling for the corruption you are suggesting doesn't seem necessary. If there is evidence (hard evidence, not hearsay or conspiracy theories) of that sort of corruption, then I think what you suggest would be a good improvement.
It is extremely common at research universities, which try to own all your ideas. I remember a professor who had a side business cleaning pools, which had nothing to do with his research area. The university found out and wanted its cut. He gave it to them because that was better than being fired and sued.
These types of contracts are illegal in CA, but legal in many other states.
I agree LLMs alone aren't good at search, but their embeddings replace the need for stemming, manual synonym lists, etc. in most cases. LLMs can also be used for query understanding, which can improve the keywords submitted to the engine, and for extracting the best snippet for a highlight. LLMs + search are better than either alone. However, LLMs still have an inference performance/cost problem that may make them unsuitable for some search use cases.
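To make the embeddings point concrete, here is a minimal sketch of semantic retrieval with no stemming or synonym lists; the model name and toy corpus are just illustrative:

```python
# Rank documents against a query by embedding similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "How to reset a forgotten password",
    "Changing your account email address",
    "Troubleshooting login failures",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(docs, normalize_embeddings=True)  # unit-length vectors

query = "I can't sign in"
query_vec = model.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, the dot product equals cosine similarity, so
# "login failures" scores highest even with no keyword overlap.
scores = doc_vecs @ query_vec
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {docs[idx]}")
```

In practice you would precompute the document vectors and put them in a vector index rather than scoring them in memory, which is where the inference cost trade-off shows up.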