skilbjo · 2025-05-06T04:10:49 1746504649

how are you thinking this will be deployed and how will developers initially integrate this? ie is this a language-level library, or an internal service in a docker container to deploy?

one idea: vitepress (vitepress.dev) has a local search engine, and it would be cool to have moss integrated in that project, https://github.com/vuejs/vitepress

srimalireddi · 2025-05-06T07:33:18 1746516798

We provide MOSS as a lightweight TypeScript/WASM library that developers can drop directly into their applications. It's designed for easy frontend integration - no backend services or Docker setups required. With just a few lines of setup, you can instantly index your multi-modal data and start running real-time semantic search entirely on-device.

We love the idea of integrating MOSS with VitePress - it's exactly the kind of high-performance, client-side experience where on-device semantic search could shine. If anyone here is connected with the VitePress maintainers or community, we'd appreciate an introduction! We'd be happy to collaborate or contribute if there’s interest in exploring an integration together.

skilbjo · 2025-04-16T00:38:28 1744763908

pretty interesting discovery if that was the hack.

do you know what the legal implications are for this?

if the company that owns 4chan finds the identity of the attacker, could they sue him in civil court? or do they send whatever logs they have to the FBI and the FBI would initiate a criminal prosecution? also what is the criminal act here? is it accessing their systems, or is it posting the data that they found "through unauthorised means" on a public channel like twitter? does the "computer fraud and abuse act" apply?

like if you found this exploit, and sent it to the company in good faith (ie a "good hacker"), are you free from prosecution? and what is the grey area, like if you found this exploit and then just sat on it for a while (let's say you didn't report it to the company, but let's also say you didn't abuse it, ie leak private data to twitter)

mmcwilliams · 2025-04-16T04:28:13 1744777693

Assuming US jurisdiction, this would pretty clearly be at least one, and probably many, CFAA violations, which are criminal.

skilbjo · 2025-04-07T15:46:56 1744040816

great to see this. two questions:

-would i be able to publish a link to the canonical openapi spec of my service, or should i plan on doing a PR in this repo for the openapi artefact?

my spec changes infrequently, but it does change sometimes, so how would updates ideally happen, given that my current workflow is publishing an openapi spec to a public link?

-how does this (and Arazzo) interact with MCP?

is this meant to work alongside MCP, to replace it, etc?

seanblanchfield · 2025-04-07T18:47:11 1744051631

Jentic co-founder here. Right now, you've got to do a PR, but we plan to monitor the web for new OpenAPI documents and automatically load them in within 24 hours.

Once ingested, we will monitor the original URL for updates. We plan to enrich ingested OpenAPI docs with any additional information we can find on the web (and live agent telemetry). These enrichments will include some spec extensions for additional info agents need (e.g., how to enrol/authenticate, rate limits, pricing, licensing, trust & safety, side-effects, rollback etc).
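
As a rough illustration (not Jentic's actual implementation), monitoring a published spec URL could be done with a conditional HTTP GET: send the last seen `ETag` in `If-None-Match` and treat a `304 Not Modified` response as "no change". The names `checkForUpdate`, `fetchImpl`, and `stubFetch`, and the example URL, are assumptions made up for this sketch.

```javascript
// Sketch only: checks a published OpenAPI URL for changes via a conditional GET.
// `fetchImpl` is injectable so the logic can be tried without a network.
async function checkForUpdate(url, lastEtag, fetchImpl = fetch) {
  // Send the previously seen ETag, if any, so the server can answer 304.
  const headers = lastEtag ? { "If-None-Match": lastEtag } : {};
  const res = await fetchImpl(url, { headers });

  if (res.status === 304) {
    return { changed: false, etag: lastEtag, body: null };
  }
  if (!res.ok) {
    throw new Error(`unexpected status ${res.status} for ${url}`);
  }
  return { changed: true, etag: res.headers.get("etag"), body: await res.text() };
}

// Stub standing in for a server whose spec currently has ETag "v1".
async function stubFetch(_url, { headers }) {
  const notModified = headers["If-None-Match"] === '"v1"';
  return {
    status: notModified ? 304 : 200,
    ok: !notModified,
    headers: { get: (name) => (name === "etag" ? '"v1"' : null) },
    text: async () => (notModified ? "" : '{"openapi":"3.1.0"}'),
  };
}

(async () => {
  const specUrl = "https://example.com/openapi.json"; // hypothetical URL
  const first = await checkForUpdate(specUrl, null, stubFetch);
  const second = await checkForUpdate(specUrl, first.etag, stubFetch);
  console.log(first.changed, second.changed);
})();
```

With a real server you would drop the third argument so the global `fetch` is used, and store the returned `etag` between polls.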

We will be careful not to clobber any 1st party docs with AI content, and to intelligently merge any AI enrichments into future versions.
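
One way to picture the "don't clobber, then merge" idea, purely as an assumption about how it might be done: keep enrichments in `x-` prefixed keys (OpenAPI's specification-extension convention) and only add them where the first-party document has not already set that key. The helper `mergeEnrichments` and the `x-rate-limit` / `x-pricing` names are hypothetical.

```javascript
// Hypothetical helper: merge AI-generated enrichments into a first-party
// OpenAPI document without overwriting anything the API owner wrote.
// Only top-level "x-" keys (specification extensions) are ever copied.
function mergeEnrichments(firstParty, enrichments) {
  const merged = { ...firstParty }; // shallow copy; we only add top-level keys
  for (const [key, value] of Object.entries(enrichments)) {
    const isExtension = key.startsWith("x-");
    const alreadySet = Object.prototype.hasOwnProperty.call(merged, key);
    if (isExtension && !alreadySet) {
      merged[key] = value;
    }
  }
  return merged;
}

// Illustrative data only.
const newFirstPartyVersion = {
  openapi: "3.1.0",
  info: { title: "Example API", version: "2.0.0" },
  "x-rate-limit": "100/min", // set by the API owner, so it must win
};
const aiEnrichments = {
  info: { title: "AI guess" }, // not an extension: ignored
  "x-rate-limit": "60/min",    // owner already set this: ignored
  "x-pricing": "free tier",    // new extension: added
};

const merged = mergeEnrichments(newFirstPartyVersion, aiEnrichments);
console.log(merged);
```

A real merge would also need to handle extensions nested inside paths and operations; this sketch covers only the top level.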

Note that a lot of APIs do not have good OpenAPI documentation, and so we'll be generating those from scratch.

In addition, we have an agent that reads any OpenAPI specs in the repo, generating potentially useful workflows composed from OpenAPI operations and other workflows (with all workflows represented in Arazzo specs). That's where all the Arazzo specs currently in the repo came from.

MCP is ideal for agents to connect to services, but is not designed to represent the depth of API knowledge we are aiming to represent (and it would be worse at its primary job if it tried to). We will shortly release our own MCP server that provides agents with convenient access to interact with this API repository over MCP. For example, to search for operations and workflows that fit a current sub-goal, and to load details so they can more reliably execute a chosen operation/workflow (assisted by an OSS library we'll release soon) and to interpret the responses intelligently.

skilbjo · 2025-03-08T07:50:45 1741420245

hi, few questions,

-who is your ideal user/customer? what stack are they using, and why do they use you?

-why did you decide to work on this? (ie, what inspired you?)

-how do you compare against openapi templated/generated SDKs? ie i use openapi-typescript (https://openapi-ts.dev/) and it is fantastic

-there seem to be some VC backed companies in the space: https://news.ycombinator.com/item?id=40147281, what is your edge?

-as someone who tried to maintain a few SDKs in my non-familiar languages (go, java, c#), i think i definitely see a paid use case for giving the problem to someone else who can maintain a quality SDK, but there are already 3-4 VC backed companies in the space, and i think the existing free tools are not bad if you spend some time with them, what is the gap/thing they're missing?

ty, john

bauchdj · 2025-03-08T17:37:43 1741455463

Hi John,

ICP: developers and teams that want to generate simple, extendable SDKs. Our goal is to bring a successful successor to the OpenAPI generator project. Most OpenAPI open source projects come and go. We want to offer an open source SDK generator suite that is dependable, yet simple.

Why: We met with many developers who want an open source successor to OA generator that generates readable code and can be extended very easily. We partnered with Trieve from YC W24 to build the Python SDK generator to start.

-2 main differences. Our goal is to bring a core open source solution across languages and to keep all the generated code human readable and easy to extend. We recognize there are projects in the space that work great. However, finding the best solution for each language is different and time consuming.

Edge: Open source first and community driven. Simpler cleaner code. Extendable by design.

There are solutions, but each one has its limitations. We can generate an SDK that works, while the current paid solutions will say the OpenAPI doc is invalid. We can generate human readable code, they don't. We want to support all languages for free, paid solutions don't. Open source projects that solve one language are often abandoned or forgotten and are extremely hard to find. Then, you have to determine which one you will use for each language.

Hope that helps,

David, Co-founder at Borea

skilbjo · on Dec 28, 2024

looks sick!

skilbjo · on Dec 21, 2024

one thing to add:

vikram is an ML engineer, and i (john) used to be a data engineer for 5+ years. vikram and i will be working on turning my brain into Ardent, the world’s best data engineer, that companies can use, at a fraction of the cost of a data engineering hire.

not only can Ardent create airflow pipelines, but it also monitors the jobs, so if/when a run fails, it self heals and gets the pipeline working again.

so many past experiences being under stress and getting paged at midnight because the data warehouse hasn’t updated today for the latest day — well, with Ardent, no longer

downrightmike · on Dec 21, 2024

"at a fraction of the cost of a data engineering hire." for now, no one is paying the actual cost of AI. Human workers will be cheaper until we get to angstrom level chip designs

skilbjo · on Dec 22, 2024

>no one is paying the actual cost of AI

what do you mean by this?

skilbjo · on Dec 10, 2024

Sweet! trying this out now.

one quick nit on your docs: https://docs.hyperbrowser.ai/guides/scrape-site

```
import Hyperbrowser from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY, });

(async () => {
  const job = await client.startScrapeJob({ url: "https://example.com" });
  console.log(job);
  const job = await client.getScrapeJob(job.jobId);
  console.log(job);
})();
```

should be:

```
import Hyperbrowser from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY, });

(async () => {
  let job = await client.startScrapeJob({ url: "https://example.com" }); // s/const/let
  console.log(job);
  job = await client.getScrapeJob(job.jobId); // remove const
  console.log(job);
})();
```

shrisukhani · on Dec 10, 2024

thanks! fixing now

shrisukhani · on Dec 10, 2024

update: fixed now :)

skilbjo · on Dec 7, 2024

I should have been clearer here -- self signup is in the works! But until then, fill out the form or shoot me an email - john[at]xhr.dev and I'll send you API keys. Unless you think I should post a public API key?

What I want to know -- do HN users like yourself have specific sites in mind + how valuable is my service? This info is so valuable to me!

skilbjo · on Nov 12, 2024

take a look at https://xhr.dev/, a product I built to avoid bot detection from things like cloudflare, imperva, aws waf, and others

enahs-sf · on Nov 12, 2024

What does the $500 a month get me? Infinite resources to scrape all of LinkedIn?

Zopieux · on Nov 12, 2024

>self host (Docker): $60k/yr

lmao ok

skilbjo · on Oct 30, 2024

take a look at https://xhr.dev/, a product I built to avoid bot detection challenges in the first place.