Ask HN: I just got $100k in AWS credits, how should I use it?
109 points by chrischen on May 14, 2015 | 156 comments
I got $100k in AWS credits, with a 1-year time limit. I built a scrapy-powered image crawler that crawls over 300 art sites and finds the most popular posts with clustering algorithms and perceptual hashing (www.arthunted.com), but in the end it takes at most a few hours of a high-CPU instance per day to scrape and process (at most several dollars per day). Over a year that barely makes a dent.
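
(For the curious, a perceptual-hashing step like the one described could look roughly like this difference-hash (dHash) sketch. It assumes Pillow; the crawler's actual scheme isn't specified here, so treat the details as illustrative.)

    from PIL import Image

    def dhash(path, size=8):
        # Grayscale, then resize to (size+1) x size so each row yields
        # `size` left-vs-right brightness comparisons.
        img = Image.open(path).convert("L").resize((size + 1, size))
        px = list(img.getdata())
        bits = 0
        for row in range(size):
            for col in range(size):
                i = row * (size + 1) + col
                bits = (bits << 1) | (px[i] > px[i + 1])
        return bits

    # Near-duplicate images sit at a small Hamming distance:
    # bin(dhash("a.jpg") ^ dhash("b.jpg")).count("1")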

I'm looking to build something that would make a splash, that would otherwise be constrained by budget, and that would have long-term self-sustaining value after the $100k runs out.

So no arbitrage, reselling, bitcoin mining, etc.

What type of project would require high-storage or high-amounts of processing? What can I build that would only be possible with that much money in infrastructure or compute power? Preferably the monthly budget would be about $10-20k.

Also I have a 40-instance limit on EC2 (which I may be able to raise).




Please oh please help the Tor network. You will be eternally loved.


Haha wow, you could set up a pretty impressive number of relays for that much money.


Do I remember right that the Tor project wouldn't accept a very large number of relays from a single party because it gives too much influence over the network?


IIRC, it takes about 50% of the network to do that. This is a waste of $100k though: the network is only used by a small number of people, it would be a permanent sinkhole for the money, and the added servers would simply disappear after a year.

This guy/gal definitely needs something more interesting. Perhaps something meaningful, or perhaps just something insanely cool. A temporary addition to an existing worker farm is not a great idea.


It would be great to help Tor and set up a ton of relays. But then again how much can we trust Amazon? And unfortunately AWS doesn't allow exit nodes.

A good alternative is Folding@Home


Try providing branch-and-bound solvers as a service. Spin up some massive EC2 instances, run something like the COIN-OR CBC solver (or license Gurobi by the EC2 hour), and let people run optimization problems on shared hosts. Charge per minute and assume you get ~60% utilization on the instances with a queue. Maybe allow problem formulation using the JuliaOpt / JuMP metalanguage.

The hard thing about optimization problems is that they take on the order of minutes to run, but you're billed by the hour.

Sounds crazy, but lots of startups - ranging from OnFleet to Lyft Line to Postmates - are probably computationally bound on problems like the Traveling Salesman Problem / bin-packing problem / knapsack problem. It's not worth $1400/instance/month to spin up the biggest computation nodes because they would get low utilization, but they still want their problems to solve quickly. If you bill by the minute, they save money and time.

Implementation would be straightforward - set up a queuing model, a timeout on each problem, and do a callback when the problem finishes. I think it's an untapped market because there has been lots of software developed for ML, but little for decision making built on top of the regression models that ML outputs.
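
A minimal sketch of that queue/timeout/callback loop (solve_with_cbc is a hypothetical helper that would shell out to CBC with a hard time limit; everything here is illustrative):

    import queue
    import threading

    jobs = queue.Queue()

    def solve_with_cbc(problem, time_limit_s):
        # Hypothetical: run CBC on the uploaded model with a time limit,
        # e.g. `cbc model.lp -seconds 60 solve` via subprocess, then
        # parse the solution file.
        ...

    def worker():
        while True:
            job = jobs.get()
            try:
                result = solve_with_cbc(job["problem"], job["timeout"])
                job["callback"](result)  # e.g. POST to the customer's webhook
            finally:
                jobs.task_done()

    # One worker per core on a big instance; bill per minute of solve time.
    for _ in range(8):
        threading.Thread(target=worker, daemon=True).start()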


I like your point about software for ML, vs software for decision making. I reckon there is a lot of existing software for decision making, but it is focused around particular domains/industries. A few months ago there was some news about a new startup offering experimental design as a service -- that's another related, seemingly under-explored idea.

Branch & bound solvers (aka combinatorial optimisation) as a service doesn't sound crazy to me, but perhaps one would need to think very carefully about the market.

What kind of customers would this service have?

If you are aiming to win customers by offering a cheaper service than alternatives, then I suppose the individual customers would need to be only the ones where it made more sense to rent this infrastructure instead of invest in their own. I.e., they would need some usage, but not usage heavy enough to justify investment in their own infrastructure, which would be cheaper for them in the long run.

Might have to think about data-security issues too, if commercially sensitive data is being uploaded to the solver back-ends.

Context:

I have worked somewhere that internally runs a service vaguely similar to what you describe. E.g. licensed commercial solver, sitting on a server with a decent amount of memory and compute, used as a back-end by various services to solve sufficiently valuable business problems for clients.

If you built a service like this, another idea is to keep it to yourself, partner with some operations research consultants, and go directly after the business problems.


Offer a performance testing service.

I saw a company pay $30K for 5 days access to a similar service.

Bootstrap the service by using this script under the hood and improve it over time: https://github.com/newsapps/beeswithmachineguns

Use your $100K credit for spawning instances, but also as a marketing hook: "first day free!"


Or...

Try building any product that offers a generous free evaluation period.

If, before the year is up, you have $paid_customers > 100,000, keep going; otherwise, just kill it.

Also, there are more services in AWS than EC2 and S3.

Just exploring them gives plenty of ideas: machine learning, transcoding, scalable NoSQL and SQL databases, mailing and DNS services, worldwide advanced networking and delivery, etc, etc, etc... all in a scalable and elastic way, with high composability and awesome APIs and docs.

Heck, you can even use that money on yourself and put it toward your own self-learning of AWS: deploy your $idea with worldwide high availability, automate all the AWS-integrable components, and provision it with time zones, usage peaks, etc. in mind.

Any idea related to social network effects or massive concurrency works nicely with an "elastic" architecture. For example, build a game that you can play from different social networks!

Lastly, if you're free to choose where the money goes, in your situation I would donate some part to my favorite open-source projects and supporters. It should feel good.


> I saw a company pay $30K for 5 days access to a similar service.

Do you know what features this service had that made it worth $30k for 5 days access? Did they have some unique and useful features, or was it more about the available bandwidth/RPS that could be generated?


This.

Enterprise pays a boat-load to HP and others for performance/load/stress testing. Someone needs to figure out how to offer serious performance testing at an affordable price.


Which company are you hinting at?


You could calculate the highest-quality rendering ever of the Buddhabrot (http://upload.wikimedia.org/wikipedia/commons/7/77/Buddhabro...). The way it's generated makes it impossible to "zoom in". You have to process the whole thing.

To make it really make a splash, you could make an incredible video by iterating through the parameters of the function - fractals turn into beautiful videos when you manipulate random parameters. This requires rendering the Buddhabrot N times for N frames of video.
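
For a sense of why this is so compute-hungry, here's a toy sketch (constants are toy values): the Buddhabrot is a histogram of the entire orbits of escaping points, so every pixel depends on samples drawn across the whole plane - there's no way to restrict computation to a zoomed region.

    import random

    SIZE, SAMPLES, MAX_ITER = 256, 200_000, 500
    hist = [[0] * SIZE for _ in range(SIZE)]

    for _ in range(SAMPLES):
        c = complex(random.uniform(-2.0, 1.0), random.uniform(-1.5, 1.5))
        z, orbit = 0j, []
        for _ in range(MAX_ITER):
            z = z * z + c
            orbit.append(z)
            if abs(z) > 2:  # escaped: plot the whole orbit, not just c
                for p in orbit:
                    x = int((p.real + 2.0) / 3.0 * SIZE)
                    y = int((p.imag + 1.5) / 3.0 * SIZE)
                    if 0 <= x < SIZE and 0 <= y < SIZE:
                        hist[y][x] += 1
                break
    # hist is now a (very low-res) Buddhabrot density map.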


This is what I would do.

3 Steps:

1. Scrape every possible image along with location data (if available). Save all these images in Amazon storage. It's best if you can scrape photo galleries that include building names, sites, or other location descriptive data. Questionable gray area, but this is a mashup of thousands of images.

2. This is now your photogrammetry grid. Take all those photos and generate 3d scenes from the data you scraped.

3. Open up shop with these 3d assets. Charge for quality of object. Extra money if you make it easy to import into UE4, Unity, or Torque and make it "Ready for the Oculus Rift".


4. Get sued for copyright infringement.


That would make a splash.


It's a new work, with substantial creativity involved. I don't see it getting anywhere.


"Substantial creativity" may get you past being a derivative work in Europe, but not in the US. Copyright law differs between jurisdictions. You'd have to be very careful.

I like the original idea. What I'd do is make sure the resulting images don't have any significant reliance on any small set of originals. So if challenged, you could re-create the scene w/o the challenged images and show a court that the scene is not closely derived from any single source.


Having "substantial creativity involved" does not prevent something from being legally considered a derivative work.


And furthermore, you cannot pay your lawyer in Amazon credits.


Unless your lawyer's name is Ed Felten, your name is Barack Obama, and the Amazon credits can be applied to GovCloud!


... 6. Profit.


That's part of business.


You should try getting into a more honest business, then.


Ok. Prove I used Exhibit A in the making of this 3d scene.


Well, we've got the server logs showing your AWS instance accessing that image on our server. As copyright is a tort, the usual burden of proof is the balance of probabilities [UK; "preponderance of evidence" in the USA, I think; please correct if this is wrong in your jurisdiction]. I'd say that's enough to swing it so that you're going to need to prove you didn't use that image... oh, and we have a HN post replying to you suggesting you do this, which swings the balance a touch further.

We can probably have an expert witness testify the scene could use that image (ie they don't visually disagree so much that the scene couldn't have derived from inter alia that image).

Not enough perhaps to prove a criminal case ...


"Here is a subpoena showing the IP address of the AWS instance you controlled along with server logs showing you accessed that image.

Based on public statements stating how you compiled the images and comparison between the client's photo and your image, it is not beyond reasonable doubt that infringement likely occurred."


Unrelated, but are you talking about creating 3d models out of point cloud data or something else?


Nope. A point cloud is just depth data and potentially color data.

Photogrammetry is the technique of 3d scanning that correlates feature points within multiple pictures in order to back-project a 3d scene.

The more pictures you have of an area, the higher quality the overall scan. So if we have 3000 images of a building's exterior in NYC, we can recreate the building in 3d.

My idea was that, for X thousand images, a single image is a trivial datapoint, and could be easily removed with little loss in quality of scan. It may technically be in violation of copyright, but is used for a substantially different work.

I believe it could possibly qualify as fair use.


Using a SIFT pipeline for photogrammetry, I have successfully recreated small objects, Comet 67P, and some buildings from quadcopter pics.

First I search for features with DoG and match them with SIFT, do a bit of extra crawling along matched edges, and the result is a dense coloured point cloud.

The point clouds are converted to a mesh with poisson surface reconstruction and retextured with fragments of the original images.

The Poisson surfaces are never quite as nice as the point clouds - I am using Meshlab for this part.

Processing a few hundred big images takes ages so I send the jobs up to EC2 for a few hours so each job is usually a couple dollars.

It is pretty useful as a 3D scanner.
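
For anyone curious, the matching front end of such a pipeline could look roughly like this OpenCV sketch (filenames hypothetical; the poster's exact tooling isn't stated):

    import cv2

    img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()  # DoG keypoint detection + SIFT descriptors
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Lowe's ratio test prunes ambiguous matches before the
    # correspondences are triangulated into a point cloud.
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    print(len(good), "putative correspondences")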


That-sounds-amazing. Any chance you could share a link of the end result? Also, any suggestions on how to get started with photogrammetry & computer vision?


would love to chat about this - do you have a contact email?


If you have correlated feature points you essentially have a point cloud no? It's just that photogrammetry adds an additional step of back-projecting the 3d scene?


Train some mad, distributed neural nets or some shit. Gather image/art data and train a net to determine beauty. Solve all possible sudokus so the world can finally be free of that junk. Make cloud-driven instant facial recognition (via social media images) a thing. Build a huge pi-as-a-service, and use it with pifs (https://github.com/philipl/pifs)

Bonus points: worlds biggest lolcat host.


Training the models is expensive, but after you have the parameters, saving and using the model becomes cheap (which meets your needs).


Actually I like the gimmicky ideas. They are usually the most newsworthy.

Is solving all possible sudokus actually possible? I can see that making the tech news rounds. Though I suppose you could just solve them on-demand.



Possible? Sure.

The brute-force solution is to consider all possible arrangements of the numbers 1-9 in a 9x9 grid and filter that down to those that satisfy the conditions of a completed grid. Then iterate over the list of completed grids; for each grid, find all possible arrangements of missing numbers and filter those arrangements down to the ones that can still be solved, i.e. where a solution is computable from the remaining information, etc.

I didn't say efficient.
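
For concreteness, the "filter down to valid completed grids" step is just a constraint check like this sketch - it's everything around it that's intractable:

    def is_valid_completed(grid):
        # grid: 9x9 list of lists containing the ints 1-9
        digits = set(range(1, 10))
        for i in range(9):
            if set(grid[i]) != digits:  # row i
                return False
            if {grid[r][i] for r in range(9)} != digits:  # column i
                return False
        for br in range(0, 9, 3):  # the nine 3x3 boxes
            for bc in range(0, 9, 3):
                box = {grid[br + r][bc + c] for r in range(3) for c in range(3)}
                if box != digits:
                    return False
        return True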


A cursory search indicates that there are 6,670,903,752,021,072,936,960 distinct 9x9 sudoku grids. That's a whole lot of cycles and a bit of a storage issue.


Genome sequencing requires a lot of CPU power and disk space. You could build an application that performs these tasks and use the first $100k in compute power to help it grow. There is a lot of domain-specific knowledge in this field and I hear bioinformatics is difficult to monetize.


I suggest you take your algorithm and apply it to other niches. There might be good horizontal scalability there. For example, why not crawl photography sites to discover popular photos and sort them by style or "taste"?

You could then create your own photo discovery service and call it Find my Style or something like that.

There are two ways to do this: you can either directly scan the photos looking for graphical patterns, or you could analyze the text of the page the photo is posted on.


I like this idea. Crawling art/photos online, classifying them, and otherwise mining the data would tie in well with my existing business, and also effectively convert $100k of computing into reusable stored assets.


Do something that matters: https://folding.stanford.edu/


In a similar vein: https://boinc.berkeley.edu/


You can try turning 100k into 150k:

https://www.eff.org/awards/coop


Unless OP finds some way to turn our understanding of prime numbers on its head, the best case in one year is turning the 100k into 50k by finding the first prime with 1,000,000 digits.


Is that prize still being offered?

I think it was claimed in 2000, "The $50,000 prize will go to Nayan Hajratwala of Plymouth, Michigan, a participant of the Great Internet Mersenne Prime Search (GIMPS), for the discovery of a two million digit prime number found using the collective power of tens of thousands of computers on the Entropia.com network."

https://www.eff.org/press/releases/big-prime-nets-big-prize


The article linked refers to the smallest of four prizes offered.

$50,000 to the first individual or group who discovers a prime number with at least 1,000,000 decimal digits (awarded Apr. 6, 2000)

$100,000 to the first individual or group who discovers a prime number with at least 10,000,000 decimal digits (awarded Oct. 22, 2009)

$150,000 to the first individual or group who discovers a prime number with at least 100,000,000 decimal digits

$250,000 to the first individual or group who discovers a prime number with at least 1,000,000,000 decimal digits


Unfortunately that prize has already been awarded :(.


Current largest prime: 2^57885161-1

So take random prime numbers larger than 57885161 (such as 57885167), find a program that can compute with numbers that large within EC2's server constraints, then see if 2^(large_prime_number)-1 is prime. Is that the correct method of doing this?

https://www.eff.org/awards/coop/primeclaim-43112609

What are the stats on testing large prime numbers on EC2 instances?
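
For reference, the exponent p must itself be prime, and the standard check for a Mersenne number 2^p - 1 is the Lucas-Lehmer test - this is what GIMPS runs at enormous scale. A toy sketch:

    def lucas_lehmer(p):
        # Primality test for 2^p - 1, where p is an odd prime.
        m = (1 << p) - 1
        s = 4
        for _ in range(p - 2):
            s = (s * s - 2) % m
        return s == 0

    # Sanity check: 2^7 - 1 = 127 is prime; 2^11 - 1 = 2047 = 23 * 89 is not.
    print(lucas_lehmer(7), lucas_lehmer(11))  # True False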


It would take a while to set up, but offensive infosec people need virtual networks to do pen testing against, because they can't pen test all day every day against their own corporate network. Let the user choose a single machine or group of machines to try to compromise. Go for low cost to attract more users, or go after big corporate training budgets a la sans.org

Put me in a lab of vulnerable servers and I will spend all weekend trying to get an admin / root shell and learn way more than any book would have taught me.

BTW, computer security is sort of taking off on indeed.com http://www.indeed.com/jobtrends?q=metasploit&l=&relative=1


Amazon is not super keen on people pentesting from or against their infrastructure.

You will be better off using labs available as ISOs or VMs.


All pen testing labs I have seen are VMs


You could potentially help millions of people by computing bite-size versions of English Wikipedia each month: run PageRank on it, render a list of the top articles to HTML, and encode it in some way that's quickly decompressable on mobile.

Network connections are often slow, and English Wikipedia is really handy but too big for lots of people to store offline.

I have some code based on Sean Harnett's work here: https://github.com/lukestanley/wiki_pagerank

Additionally, Wikipedia has awesome stats on pageviews that need crunching - there is a wealth of cultural, zeitgeist info that can be parsed and used to prioritise with more than PageRank.
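
The ranking step itself is small - the heavy lifting is building the link graph from a dump. A sketch with networkx, where `links` is a hypothetical iterable of (from_title, to_title) pairs:

    import networkx as nx

    def top_articles(links, n=10000):
        g = nx.DiGraph()
        g.add_edges_from(links)
        scores = nx.pagerank(g, alpha=0.85)  # standard damping factor
        return sorted(scores, key=scores.get, reverse=True)[:n]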


Use it to build web scrapers hunting for AWS Keys on GitHub, to spin up more instances to scrape for AWS keys to..


AWSception


Use it to help the world: throw up a few thousand Tor relays/bridges.


You could make something that lets other people spawn instances in your cloud (e.g. via CloudFormation). For instance, take <random open source project not available in AWS> and let people easily spawn clusters of it. This way you can run your beta phase for free, test, then set the price to the public.


One idea I had was to turn http://www.arthunted.com into a service, so you can specify parameters and sites to crawl and gather data on demand. $5 a day times xxxx users.


scrapinghub seems to be the top dog in this space. You might want to check it out.

What's your background? Are you looking for business idea that can be built upon your existing site, or are you looking for something completely new?


It doesn't have to be a business idea! Just an interesting idea that would otherwise require a lot of compute or storage, or something else provided by AWS.


chrischen, I am also facing a similar situation. So, which of all the suggestions did you find the most interesting? And what have you actually planned to do with it?


WHOA. This is like Google-scale computing power... AI for sure. Let me do some research, but here are some ideas:

disclaimer: these are a bit of a stretch to be sure, but hey

Never-ending learner:

Part 1: machine vision on video + language learning on the web + pattern recognition => new semantic constructs

Part 2: semantic constructs + reinforcement learning + intrinsic exploration => action selection

Result: an agent that can take in real-world scenarios, including language, text, and vision, and learn to do tasks by itself, improving how it learns as it goes. EPIC.

Autonomous programmer:

Code with comments (from GitHub, community-based labeling, etc.) + machine learning + knowledge representation => "understanding code" machine + natural language => PROFIT

Result: you can say, "what are the sales this quarter?" and it'll deduce the logical steps (parse, read from db, etc.) and tell you the answer.


Do a better version of iThenticate, which helps to prevent and find plagiarism in published content. We use it to help verify that content someone has sent us is unique and not just copy-pasted in part or whole. It also helps us find uses of our published content and course material that have been re-purposed or copied verbatim. The entry-level price point for iThenticate is about $5k per year. And unfortunately copyscape.com is not the same as iThenticate.


some considerations

- budget for bandwidth... esp if you are doing something more than text CRUD and want to serve it - e.g. image/video

- video is "heavy" and so necessarily takes a lot of compute, ram and storage and can also leverage gpu and pricier instances, depending on what you are trying to do... e.g. understanding video content with opencv/opencl or specific types of drawing like raytracing

- instance limit increases with AWS are perfunctory... so i wouldn't consider 40 a limit; certainly don't design anything interesting with that limit in mind

- spot instances can save you a lot and stretch that $100K 1.5x - 4x depending on region, availability zone, and instance type

- unless you've done it before, time is your enemy to get into position to spend that money on something useful... so your monthly budget target range makes sense


Go talk to the guys at ClusterK.com about their balancer device. It will logically auto-balance your spawn of spot instances across many AZs and make you very resilient. Tell them you want to prove out their balancer.

Do this, because intelligently following the cheapest spot price will save you 90% of the typical costs.

This will make your $100K able to support a ton more instances than it would otherwise.

Make sure that whatever you do though is not chatty between the nodes - if you have a ton of instances talking across zones the data transfer fee can be significant.

Make sure that you store large data on an EBS volume that an instance mounts, to prevent large transfer fees between instances and S3.

All instance limits can be raised. The only hard limit in AWS is 100 S3 buckets per account.

Just put in a limit increase request via the console. Email gilleyt@amazon.com if you have issues.


>>> I got $100k in AWS Credits

May I ask how ?


It's a credit given to companies backed by an accelerator.


Even ones that can't think of anything to do with it? I wish I had these problems :-/


It's given out like candy; if not the $100k one, you can surely get several thousand dollars' worth of credits for server hosting, either on Azure or through Amazon.


Can confirm - with my MSDN license I get a $150/mo credit for any development/test instances. As long as I have a valid license (and they offer the deal) I'll get the credit.


which one?


I know Startupbootcamp has these deals for their batches. A similar deal is available for Google business users.


Look up some of the prime factorization prizes and do the math to see if it is achievable.


I would invest it in 5 early-stage startups - $20K of computing time each.


Well, if you're literally not going to use it - and you'll lose it at the end of a year... Can you donate it to Folding@Home or something?

I'm guessing AWS wouldn't want you to...


Why don't you use the credits to try to create a publicly queryable index of the web in a standardised format? Read: open-source search engine. As you've got the money, just ignore efficiency.

Else... commit part of it to one of the computing@home projects?


If you're interested in a publicly queryable index of the web, you could try running a search server such as Elasticsearch on the Common Crawl[1] corpus. Elasticsearch runs the search backend of WordPress, 600 million+ documents in total[2], so extending it to a Common Crawl archive seems possible.

n.b. I'm a data scientist at Common Crawl, so have a vested interest!

Also, whatever experiment you end up pursuing, remember to use spot instances if your setup allows for transient nodes - it'll substantially decrease your burn rate (usually 1/10th the price) allowing for even larger and more insane experiments :)

[1]: http://commoncrawl.org/

[2]: http://gibrown.com/2014/01/09/scaling-elasticsearch-part-1-o...
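
For what it's worth, bidding for spot capacity is a few lines with boto3 (the AMI, price, and instance type below are hypothetical placeholders):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.request_spot_instances(
        SpotPrice="0.10",               # max bid per instance-hour, USD
        InstanceCount=10,
        LaunchSpecification={
            "ImageId": "ami-12345678",  # hypothetical worker AMI
            "InstanceType": "c4.xlarge",
        },
    )
    print([r["SpotInstanceRequestId"] for r in resp["SpotInstanceRequests"]])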


I had a crawling project where I wanted to get a sense of a few ad-related things on the internet and came upon Common Crawl. I was initially excited, since I thought it would have incidentally captured the data I wanted, but I was disappointed to find that they did not do any kind of JS execution, which limited the effectiveness for me pretty drastically.


I'd never heard of Common Crawl before but it looks like an awesome project! Keep up the good work!


How up-to-date is commoncrawl data?


The idea we had at my work was to set up a web service that just puts URLs in your S3 bucket. Nothing more, nothing less.

Filepicker's storeUrl function without the baggage. Designed for server-side apps that prefer to avoid streaming / downloading-and-re-uploading files locally in order to get them into a bucket.

Non-trivial compute-resource value-add that would take very little time to code. Low risk for the adopter: their files are stored in their bucket.


Whatever you do, only use spot instances.


I assume you had two options, 100k for a year or 10k for 2 years (I think)? (I've seen these offers)

Is that the case? And if so, why not take the other option?


$100k is a much bigger number.


Sure but it's like saying "here's a meal for 20, you have to eat it all today... or I'll feed you and only you for a year"


No, it's like saying "here's $20, you have to use it all today. Or I'll give you $4, half today, half tomorrow".


Take the meal for 20 and host an impromptu banquet. Charge a per-person fee and make some money.


I'd say do this http://lg.io/2015/04/12/run-your-own-high-end-cloud-gaming-s... and make it super affordable with generous free plans.


You could build a video conversion website where one uploads original high-resolution video and it spits out 1080p/720p/320p and other formats suitable for delivery across different devices and bandwidths. This could be an alternative for people hosting video on YouTube and getting slapped with ads. An effort like this would use a lot of CPU, but once a video is converted it's just the storage cost. The common challenge is copyright, but I can see ways to promote it as a professional service rather than a collection of random videos. $100K would cover the cost of offering it free for a limited period.
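
The fan-out transcode step could be as simple as shelling out to ffmpeg, as in this sketch (the rendition list and flags are illustrative, not tuned):

    import subprocess

    RENDITIONS = [("1080p", "1920x1080"), ("720p", "1280x720"),
                  ("320p", "480x320")]

    def transcode(src):
        for name, size in RENDITIONS:
            subprocess.run(
                ["ffmpeg", "-i", src, "-s", size, "-c:a", "copy",
                 f"out_{name}.mp4"],
                check=True,  # raise if ffmpeg fails on this rendition
            )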


As strange as it sounds, that much bandwidth would probably suck up $100k faster than you'd imagine. There's a reason a lot of large companies that deal in video have their own data centres and crazy bandwidth deals that make it cheap. I don't think $100k at Amazon's prices would last more than a few months if the service became popular.


That sounds pretty much identical to Zencoder, encoding.com, etc.


>> ... no arbitrage, reselling, bitcoin mining

You just scrapped my first 3 ideas :)


Actually, it's an interesting task to come up with a creative way to convert Amazon credit into hard cash (which doesn't expire).

I don't think bitcoin mining via CPU (virtualized especially) is feasible for that anymore.


Altcoin mining on the amazon gpu machines was still viable, last I checked.


Well, it WAS, until Scrypt ASICs came out about a year ago. Now it's not viable. You'd be lucky to get a percent back.


Would running a PaaS on top of EC2 be considered "reselling"?


Only if it is purely a subset of EC2.


Do what you like -- but reserve instances so you can do it for longer.


Sadly,

> You may not use Promotional Credit for any fees or charges for Reserved Instances, Amazon Mechanical Turk, AWS Support, AWS Marketplace, Amazon Route 53 domain name registration, any upfront fee for any Service

[1]: https://aws.amazon.com/awscredits/


Artificial intelligence? It needs huge datasets and lots of CPU to train. Then you could put it to work, maybe captioning video or something.


Solve chess. Write a paper: "White wins."


I will let others answer "how should I use it"; what I want to know is how you got $100k in AWS credits.


You could experiment with server farms and EC2 spot instances to figure out strategies for minimizing the cost of maintaining huge server farms. Once learnt, you can sell that skill to multiple companies, and with the money gained just do your next thing, like giving to charity, having a beer, etc.


Host art projects. Let people render fractals. Render video walkthroughs of highly detailed scenes.


Build an artificial life program, and use it to discover and optimize algorithms or electronic circuits relating to a problem that interests you.

If you don't have a problem of your own, build a better internal power supply for consumer electronics devices.


Build fuzzing infrastructure, to find bugs that people will pay for (example: google chrome). There's a revenue stream, interesting technical challenges, and you're helping raise the security bar.
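
The core loop is tiny - the credits go into running it at scale. A toy illustration (the target binary and seed file are hypothetical; real infrastructure, AFL-style, adds coverage feedback and corpus management):

    import random
    import subprocess

    def mutate(data: bytes) -> bytes:
        b = bytearray(data)
        for _ in range(random.randint(1, 8)):  # flip a few random bytes
            b[random.randrange(len(b))] = random.randrange(256)
        return bytes(b)

    seed = open("seed.bin", "rb").read()  # hypothetical non-empty seed input
    for i in range(10_000):
        sample = mutate(seed)
        proc = subprocess.run(["./target"], input=sample)  # hypothetical binary
        if proc.returncode < 0:  # killed by a signal => likely crash
            open(f"crash_{i}.bin", "wb").write(sample)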


For those who want to know what this $100k deal is...

http://aws.amazon.com/activate/benefits/



Put the computer power toward some good cause -- Folding@home or similar.


3d render farm?


He's a bit limited on instances though.

Still, could find some animation studios, offer them your farm for some % of profits.

Pretty big gamble though.


For those wondering how he got this much in credits:

http://aws.amazon.com/activate/


Would it be possible to create a site that shows in near real-time what images are being shared or are popular at any point of time?


We also got the $100k AWS credits for our startup. We're going to be using it to pay for our server and CDN costs :)


Well by the time I got it, I had already paid for reserved instances...


How did you manage to get that much credit?


But how did you manage to get that much??


would you mind sharing some details about how you are scraping?

I am also trying to build a crawler. The problem is each site has its own HTML structure. How do you handle this? Have you written scraping rules for each site? That's a nightmare to maintain, especially when you have a lot of sites to crawl.


I think you want to have a system that can use XPath or CSS queries to select the elements you want.

This way writing a scraper for a given page is almost as easy as right clicking on an element in dev tools and selecting "Copy as XPath" for what you want.

You definitely need some validation that your scraper is still returning accurate results, so that you can get notified when things go wrong. Things like following links from an item to the item's product page and comparing scraped prices, names & images should get you a lot of the way.

At some point this will definitely get unwieldy, and you can try to build a more general solution that can understand grids or layout, but despite my preference for this as both a shopper of long tail sites and a developer, this is probably not where you want to start unless the long tail is your actual niche.


You recommend staying specific to a few categories instead of crawling everything available on the internet? We are starting only with women's clothing.


I was more referencing the typical approach that people took of supporting the top N most popular sites and increasing N as they got bigger.

It's a solid approach for hitting the majority of the market, and works fine for alerting, but it leaves a pretty big gap in the market for people who are interested in comparison shopping for more boutique items. E.g., designer male fashion gets sold by piles of different boutiques, each with their own sales, etc., but the items are exactly the same, and I would really like to know when something I am interested in goes on sale at one of the 50 different stores that have that item - and only when it goes on sale in my size, and whether it's actually cheap after currency conversion and shipping. A person can dream, right?

Shit, I would love it if there was a platform that could guess my size across various items in different brands.

I've thought of this space a bit since I buy a decent amount of clothes, but I've never gone ahead and tried to execute.


If you're building a general-purpose crawler, use a regexp to select the content of the body tag, then strip out all the tags inside it. You'll be left with a long string of words that you can then index... Tags, generally speaking, are unimportant if you're not rendering the content.

Of course you might want to leave some tags in, like links and titles. They convey more than just layout.
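
A quick-and-dirty sketch of that approach (regexes are brittle against arbitrary HTML, but fine for rough text extraction):

    import re

    def body_text(html):
        m = re.search(r"<body[^>]*>(.*?)</body>", html, re.S | re.I)
        body = m.group(1) if m else html
        return re.sub(r"<[^>]+>", " ", body)  # strip the remaining tags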


Thanks for your reply. I am building a price comparison/alert engine, so I am interested in product descriptions, prices, images, or anything else closely related.


Shameless plug: You might be interested in checking out Diffbot (http://www.diffbot.com/). That use case is exactly what it was built for.


I just used scrapy. It lets you query using XPath or CSS selectors.
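
A minimal spider along those lines (the site URL and selectors are hypothetical):

    import scrapy

    class ArtSpider(scrapy.Spider):
        name = "art"
        start_urls = ["https://example.com/gallery"]

        def parse(self, response):
            # CSS selectors shown here; .xpath() works the same way
            for post in response.css("div.post"):
                yield {
                    "title": post.css("h2::text").get(),
                    "image": post.css("img::attr(src)").get(),
                }
            next_page = response.css("a.next::attr(href)").get()
            if next_page:  # follow pagination
                yield response.follow(next_page, self.parse)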


How about not using it, because AWS doesn't run on 100% renewable energy? Make a $100,000 statement!


You could build a middle-out compression algorithm, which would make data storage problems smaller.


I would suggest you buy servers on a reserved contract and sell them on to others.


If someone had an idea for how to get value out of $100K in AWS credits, do you think they would tell you? Why wouldn't they take the idea, pitch it, get $100K in AWS credits, and then double it themselves?


Because he's not asking for ways of generating more than $100k of profit out of his $100k in AWS credits, which would be quite difficult. He's in the unique situation of having $100k in free AWS credits and would probably be happy getting back half of that as cash.


Well I'm less interested in converting it to cash, as I could easily just start reselling the credits.

I'm trying to do some hacky project that would make a big splash... as there's $100k of value to be consumed.


Out of curiosity how'd you get the credits?


Any idea where you could resell the credits?


Because humans are often irrational and lazy.


In a word: Risk


Open up the machines and post credentials here :-)


Do something with video. That could suck up $100k.


Jarvis, of course


Why can't you mine bitcoin?


If you do the math, you won't get much more than $4-5k in bitcoin...


As above, but also, mining cryptocurrencies is against the AWS terms of service.


Citation please. I just searched the AWS ToS, AUP, and customer agreement and found no such restriction.

Or do you mean that the startup $100K credit is restricted from obvious reselling/mining/anything that isn't a value-add? Because of course AWS doesn't want to give away money, but to promote startups actually using their services.


Why is that? Are they oversubscribed?


Because that only makes ~1,000$ or less which seems like a huge waste.


Porn Site.


Host a massive minecraft server.


gpu instances -> bitcoin mining


DDoS a small country. Even if for a day.


train a convnet to recognize 80M tiny images: http://groups.csail.mit.edu/vision/TinyImages/


Resell it. Turn it into real money.


mine dogecoin? (j/k)


The altcoin ideas are getting downvoted, but with the right alt coin he could make $10k+ in the bank to keep.

Anything related to creating an ongoing service, whether for love or profit, is madness, because in 12 months that service has to stop unless the OP has $100k to spare to keep it going for the next year.


What are the right alt coins that you're hinting at?


How about an altcoin miner cluster ? :D


Crypto-currency mining? You could burn through that very quickly with enough machines.



