I haven't run separate evaluations, but the documents I used to create the Skills come from official AI lab documentation, technical blogs from Manus, Chroma, and Anthropic, and many arXiv papers.
I've been building multi-agent systems for the past year and kept running into the same problems: context windows filling up with tool outputs, agents losing track of information buried in the middle of long conversations, and supervisors becoming bottlenecks as they accumulated state from all workers.
The solutions to these problems are scattered across research papers, framework docs, and production war stories. I collected and synthesized them into a set of "Agent Skills" - structured instructions that agents can load on demand when working on relevant tasks.
- `context-fundamentals`: What context actually is (system prompts, tool definitions, retrieved docs, message history, tool outputs) and why context quality matters more than context length
- `context-degradation`: The failure modes - lost-in-middle (10-40% accuracy drop for middle content), context poisoning (hallucinations that compound), context distraction (irrelevant info consuming attention budget)
- `multi-agent-patterns`: Supervisor vs swarm vs hierarchical architectures, when to use each, and the "telephone game" problem where supervisors paraphrase sub-agent responses incorrectly
- `memory-systems`: Why vector stores lose relationship information, when to use knowledge graphs, and how temporal validity prevents outdated facts from conflicting with new ones
- `tool-design`: The consolidation principle (if a human can't say which tool to use, an agent can't either), error messages that enable recovery, response format options for token efficiency
- `context-optimization`: Compaction triggers, observation masking (tool outputs can be 80%+ of token usage), KV-cache optimization
- `evaluation`: Multi-dimensional rubrics instead of single metrics, LLM-as-judge for scale, human review for edge cases
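To make the observation masking idea from the list above concrete, here is a minimal sketch (my own illustration, not code from the skills themselves). It replaces older tool outputs with a short placeholder and keeps the most recent ones intact. The message shape (dicts with `role` and `content`), the function name `mask_old_observations`, and the `keep_last` parameter are all assumptions made for the example.

```python
from typing import Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def mask_old_observations(
    messages: List[Message],
    keep_last: int = 2,
    placeholder: str = "[tool output masked]",
) -> List[Message]:
    """Return a copy of `messages` where all but the last `keep_last`
    tool outputs have their content replaced by `placeholder`."""
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    # A slice ending at -0 would be empty, so handle keep_last == 0 explicitly.
    if keep_last > 0:
        to_mask = set(tool_indices[:-keep_last])
    else:
        to_mask = set(tool_indices)

    masked: List[Message] = []
    for i, message in enumerate(messages):
        if i in to_mask:
            masked.append({**message, "content": placeholder})
        else:
            masked.append(dict(message))  # copy so the input is never mutated
    return masked
```

The original history is left untouched, so the full tool outputs can still be logged or re-fetched if the agent needs them again.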
The collection uses Anthropic's open Agent Skills format. Each skill is a folder with a SKILL.md file containing instructions. It relies on progressive disclosure: agents load only skill names and descriptions at startup, and the full content loads when a skill is activated for a relevant task.
Works with Claude Code, Cursor, or any agent that supports skills/custom instructions.
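For anyone wiring this into their own agent, progressive disclosure can be sketched roughly like this. It is a minimal sketch under assumptions, not the way any particular tool implements it: I'm assuming each SKILL.md starts with simple `key: value` frontmatter between `---` lines that holds `name` and `description`, and the function names (`split_frontmatter`, `load_skill_index`, `activate_skill`) are hypothetical.

```python
from pathlib import Path
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md into (metadata, body).

    Assumes flat `key: value` frontmatter delimited by `---` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the whole file as body


def load_skill_index(skills_dir: Path) -> Dict[str, Dict[str, str]]:
    """Startup step: collect only the name and description of each skill."""
    index: Dict[str, Dict[str, str]] = {}
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        meta, _ = split_frontmatter(skill_file.read_text(encoding="utf-8"))
        name = meta.get("name", skill_file.parent.name)
        index[name] = {
            "description": meta.get("description", ""),
            "path": str(skill_file),
        }
    return index


def activate_skill(index: Dict[str, Dict[str, str]], name: str) -> str:
    """Activation step: read the full instructions only when needed."""
    text = Path(index[name]["path"]).read_text(encoding="utf-8")
    _, body = split_frontmatter(text)
    return body
```

Only the small index goes into the prompt at startup; `activate_skill` is called once the agent decides a skill is relevant to the task at hand.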
Would appreciate feedback, especially from anyone running multi-agent systems in production. What patterns are you seeing that aren't captured here?
I've always been someone who can easily come up with new business ideas, but I've noticed that many of my friends and colleagues struggle with this.
Take Y Combinator, Shark Tank, Gary Vee, and the My First Million podcast, for example. These platforms and personalities offer incredible insights into the business world through their conversations, guests, and real-life business stories.
Now I know what you're thinking - spending hours watching these videos might not be the best use of your time.
That's why I created an AI agent to do the work for me!
This AI agent goes to my selected YouTube channels, listens to the content, and generates a list of business ideas, complete with summaries and how-to guides. It even categorizes them by market (B2B, B2C, B2G) and by type (AI-focused, service-based, and many more).
So, I'm excited to share my latest project - The Idea Vault: 14,838 Unique Business Ideas from Leaders.
The AI agent compiled this list by analyzing the following channels:
Shark Tank
My First Million
Y Combinator
This Week in Startups
Gary Vee
Tim Ferriss
TED Talks
SaaStr
All In
Alex Hormozi
Tony Robbins
Codie Sanchez
The Diary of a CEO
EO
James Sinclair
Lenny's Podcast
Marketing Against the Grain
No Priors
Stanford Business
Startup Grind
UpFlip
Young Entrepreneurs Forum
Whether you're a startup owner, small business owner, student, or side hustler, I think you'll find this list incredibly valuable. Check it out and let me know what you think! I'm also happy to share the code I used if anyone is interested.
I asked the same question to two different ChatGPT accounts:
"What was the most devastating event in January 2022?"
The first one is my personal ChatGPT account.
The second screenshot is from my company account.
While the first one acknowledges a knowledge cutoff date of January 2022, the second one specifies its training cutoff as September 2021 yet still provides answers to the question.
I think it is likely that the September 2021 cutoff is included in much of the recent training data and that's why it often defaults to saying that.
I experimented with starting new chats, each with a different date, using the following format:
"I thought your knowledge cut-off was <Month> <Year>"
Out of five tries, it responded each time with some variation of "the knowledge cutoff is actually September 2021". That is why I think this is almost certainly due to training data, since the previous ChatGPT system prompt mentioned that as the cutoff date.
Currently the invisible system prompt for ChatGPT's GPT-4 seems to be:
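If anyone wants to repeat this more systematically, here is a rough sketch of how the probes could be generated and the replies tallied. It makes no API calls: you paste in the replies you collected yourself. The helper names (`build_probes`, `claimed_cutoff`, `tally_cutoffs`) are hypothetical, and the extraction rule is only a heuristic I'm assuming for illustration: it takes the first "Month Year" in the reply that differs from the date you suggested, and falls back to the suggested date if the model simply agreed.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional, Tuple

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_RE = re.compile(rf"\b({MONTHS})\s+(\d{{4}})\b")


def build_probes(dates: Iterable[Tuple[str, int]]) -> List[str]:
    """Build one probe message per (month, year) pair, using the format above."""
    return [
        f"I thought your knowledge cut-off was {month} {year}"
        for month, year in dates
    ]


def mentioned_dates(text: str) -> List[str]:
    """All 'Month Year' mentions in the text, in order of appearance."""
    return [f"{month} {year}" for month, year in DATE_RE.findall(text)]


def claimed_cutoff(reply: str, probed_date: str) -> Optional[str]:
    """Heuristic: the first date in the reply that is not the probed one.

    Returns the probed date if it is the only date mentioned (the model
    agreed), or None if the reply contains no recognizable date.
    """
    dates = mentioned_dates(reply)
    for date in dates:
        if date != probed_date:
            return date
    return dates[0] if dates else None


def tally_cutoffs(results: Iterable[Tuple[str, str]]) -> Counter:
    """Count claimed cutoffs over (probed_date, reply) pairs."""
    return Counter(claimed_cutoff(reply, probed) for probed, reply in results)
```

Each probe should go into a fresh chat, as in the experiment above, so that earlier answers don't influence later ones.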
"You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.
> (Wikipedia) Omicron was first detected on 22 November 2021 in laboratories in Botswana and South Africa based on samples collected on 11–16 November [...] On 26 November 2021, WHO designated B.1.1.529 as a variant of concern and named it "Omicron", after the fifteenth letter in the Greek alphabet. As of 6 January 2022, the variant had been confirmed in 149 countries.
One could extrapolate this would happen, but given that there were fourteen previous ones and only a few of them turned into the dominant variant (maybe five at that point? Estimating here), I guess indeed this weakly indicates data being up-to-date till at least late November, if not indeed Dec/Jan 2022.
> (Wikipedia) In January 2022, the Hunga Tonga–Hunga Haʻapai volcano, 65 km (40 mi) north of the main island of Tongatapu, erupted, causing a tsunami which inundated parts of the archipelago, including the capital Nukuʻalofa. The eruption affected the kingdom heavily, cutting off most communications
Now, here it was spot-on and was not predictable as far as I know. Clearly it knows of global news from January.
Based on the two screenshots, I'd conclude that it uses the same model for both of your accounts, but that the "I'm trained until 2021" claim is somehow still prevalent in its data or otherwise ingrained, and you're getting one or the other based on the random seed or something similar.
In January 2022, there were several significant events:
Wildfires in Boulder, Colorado: These fires led to the evacuation of over 30,000 people and the destruction of homes across Boulder County [1].
COVID-19 surge in the U.S.: The U.S. reached a record number of COVID-19 cases, with the Omicron variant making up 95% of the cases [1].
Hunga Tonga-Hunga Ha’apai volcano eruption: This eruption sent tsunami waves around the world. The blast was so loud it was heard in Alaska – roughly 6,000 miles away. The afternoon sky turned pitch black as heavy ash clouded Tonga’s capital and caused “significant damage” along the western coast of the main island of Tongatapu [2].
These events had a profound impact on people’s lives and the environment.
While generally I agree, I think it's at least amusingly relevant here given that that is essentially ChatGPT's response in the second screenshot above.
From unicorns to my own startups, I've learned a thing or two about funding. I'm still learning the details of fundraising myself, and I thought, why not share what I've gathered so far?
So, here's a guide that covers all the basics of fundraising, in simple language.
251 Questions & 11 Categories
Investment Agreements
Legal Implications
Equity and Debt Investments
Milestones and Expectations
Impact of Investment on Company Structure
Exit Strategy Planning with Investors
Intellectual Property
Confidentiality Agreements with Investors
Negotiating with Investors
Investor Relations Management
Fundraising Basics
In this FREE guide, you'll find answers to:
How to get investment?
What are legal dos and don'ts?
How to structure equity & debt?
What are the key milestones?
How does investment impact your company?
How to plan exit strategies?
How to protect your ideas?
How to keep things confidential?
Just by entering a simple blog title and brief content direction, the AI workflow handles the entire content creation process end-to-end. It crafts engaging, 1400-1800 word articles optimized for your target audience. See how.
Analyzing @Tesla's Q2 Financial Report using these LLM APIs:
OpenAI GPT-4-32k
AnthropicAI Claude 2
MetaAI Llama-70b-v2
Surprising results from the cost-speed analysis, and a divergence in the models' perspectives on Tesla stock. The comparison covers model accuracy, speed, cost-effectiveness, detail, and improvement. Powered by Pinecone, LangChainAI, and StackAI_HQ. Full results at the Twitter link.
Thanks for your comment! We absolutely understand the appeal of a personal touch from your local budtender. However, not everyone has access to such a resource, and that's where BudBuddy comes in.
BudBuddy offers several unique advantages:
24/7 availability
Data-backed suggestions
Privacy
Learning experience
Ease of use: Browsing through countless strains on platforms like Weedmaps can be overwhelming. BudBuddy simplifies this process by asking you easy questions and giving you targeted recommendations.
BudBuddy AI is not designed to replace your local budtender but to complement them, offering an additional, accessible resource for anyone looking to explore cannabis strains.
We're continuously improving and adding features and we appreciate your feedback!
We're building an AI Budtender Chatbot with a long-term vision and room to grow. However, we are seeking an angel investment to launch the marketing campaigns that will drive our brand partnerships.
Any suggestions to find cannabis sector tech angels?
TBH my experience at the Lift cannabis conference in Toronto today was somewhat disappointing, as the majority of attendees appeared to be mid-level staff.