"The Great Automatic Grammatizator" by Roald Dahl is one of the stories that stayed with me through the years. In the story, there is a machine that can write stories, replacing authors, though at least the machine's owner paid real authors to use their names.
https://en.wikipedia.org/wiki/The_Great_Automatic_Grammatiza...
@cebert Have you looked at serverless-specific solutions like Thundra (my company), Serverless.com, etc.? I think the cost for your use case may be an order of magnitude lower, since the pricing is based only on the number of invocations.
What if it were easy to see why a test is flaky, and to compare failed/successful test runs like a code diff? Would that be useful?
This is what we're building at Thundra (the Foresight product): we instrument the tests as well as the backend services so that devs can quickly diagnose failing/flaky tests. Would appreciate any feedback you may have, here or privately.
It would be helpful to be able to present diffs of log output between successful and failing runs of a test.
This is tricky to implement, for several reasons.
Log output is normally timestamped, making every line unique. Those parts of log lines would need to be ignored when comparing between runs.
Log output ordering is often indeterminate, particularly when a test has multiple threads or interacts with an external service. Often the order of logged events is an essential feature of the difference between a successful and a failed run, but some or most ordering differences are just incidental. The number of logged events may also vary, incidentally or significantly.

Explaining all these differences in detail to the test system would be too hard. So the system needs to discover as much of this as possible for itself and represent those discoveries symbolically, then allow a test to be annotated to override its default judgments about the diagnostic significance of these features.
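A rough sketch of the timestamp point above (the regex, function names, and sample log lines are all made up for illustration, not from any real tool): replace timestamps with a fixed token before diffing, so only the meaningful parts of the lines are compared between runs.

```python
import difflib
import re

# Hypothetical timestamp pattern; real logs would need per-format patterns.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?")

def normalize(line: str) -> str:
    """Replace timestamps with a fixed token so they don't add diff noise."""
    return TIMESTAMP.sub("<TS>", line)

def log_diff(passed: list[str], failed: list[str]) -> list[str]:
    """Unified diff of two runs' logs after stripping timestamps."""
    return list(difflib.unified_diff(
        [normalize(l) for l in passed],
        [normalize(l) for l in failed],
        fromfile="passed", tofile="failed", lineterm=""))

# Made-up sample logs from a passing and a failing run of the same test
passed = ["2024-05-01 10:00:01.123 INFO connected to db",
          "2024-05-01 10:00:02.456 INFO query ok"]
failed = ["2024-05-01 11:30:07.789 INFO connected to db",
          "2024-05-01 11:30:09.001 ERROR query timed out"]

for line in log_diff(passed, failed):
    print(line)
```

With the timestamps normalized away, the "connected to db" lines compare equal and the diff shows only the substantive change. The ordering and event-count problems described above are much harder and would need something beyond a plain line diff.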
This happened to me as well. I wanted to share how we dealt with it, as it was different from most comments here and may be useful to you. We paid. My company was sued along with a few multi-billion-dollar companies. I only found out because one of their lawyers called me: they were planning to fight back and wanted to check whether I wanted to join forces.
I found a good patent attorney and explained the situation. After learning that we were a small bootstrapped company with less than $500K revenue that year and only a couple hundred thousand dollars in the bank, he suggested that I first talk to the patent troll's lawyers myself, without a lawyer. His rationale was that they were not really after us since we were small, and once they learned that, they would either drop it or ask for a small amount to settle.
I followed his advice and had a call with them, and it went exactly as he predicted. Upon learning that our US revenue was only a few hundred thousand dollars, they asked for $30K to settle and sent me an agreement. He reviewed the agreement for us and recommended that we settle. We did. It was a ridiculous patent, yet all of the companies that were sued settled as well. The experience was traumatic and was one of the main reasons I decided to raise money from investors later. I slept better knowing that we had the funds to fight lawsuits if need be. The patent attorney was happy to take the case if we needed him to, but warned us about how expensive it would be.
Given that you're a single-person company, you're likely a small fish for a patent troll. The best option for you may be to let them know that and see what they do. Best of luck!
Good point. There are also some software solutions (Picus Security, etc.) that validate whether your environment is actually exposed due to specific CVEs. It's a good way to prioritize which vulnerabilities you should tackle first.
It's not problematic to reference Apache 2.0; it's a fine license. I assume what berkay is talking about is the wording "<existing license> with Commons Clause": the Apache Foundation, for example, has explicitly requested that this not be done (which the creators of the Commons Clause conveniently ignore), because it changes the meaning of the license considerably while sounding like an official variant of it. Apache wants any "Apache 2.0 with add-ons" licenses to give more rights to the recipient, not fewer.
I remember reading some things about mixing Apache 2.0 with other restrictions (like the Commons Clause in this context), and a lot of debate about the wording of such a mix.
The analogy does not work here and is misleading. You cannot do much, if anything, with the 20 most representative pixels (if there is such a thing), but you can infer highly valuable characteristics about the person. Yes, you cannot recreate the original data, but what you end up with is potentially much worse (more sensitive/private) than the original data.
Unless the data is completely random, it's not crazy to say that it can be reconstructed from a reduced version.
If you have a million points that largely fall on a 3-dimensional line and you project them into 2 dimensions, you can easily recover the lost dimension, with losses proportional to the deviation from the line. And that loss may not even matter, depending on the kind of data and the margins of error you're working with.
This is actually a nice illustration of the central problem with this argument: the more personally identifiable a piece of information is, the less recoverable it'll be, and vice-versa. If all of the points of data are on some n-dimensional line, then obviously all of them can easily be recovered, but knowing all those things about a person doesn't actually tell you any more about them than knowing just one of those things. Conversely, if the points of data are very random then it'll only require a handful of points to uniquely identify a person and find the entry in the original data set with all their other information, but dimensionality reduction will have to throw that data away - you simply won't be able to recover that information from the model. (We actually know from the literature on de-anonymization that a lot of data falls into the second category.)
How many dimensions were they working with and how much variance and correlation was there in the features? What's the margin of error for the end product?
Yes, it's very common to rent from a company rather than an individual landlord. Some of these companies are massive, managing hundreds of thousands of apartments. Many of the buildings are specifically built as rentals and all units are for rent, hence there is often a leasing office somewhere in the building.
OK. Here such companies just have one central office per city/region, no building-specific ones. The largest companies here manage "only" tens of thousands of rental apartments, though.
How do they handle visits to the grounds/prospective apartments for interested renters? That's one of the main duties of the leasing office staff in the US. When looking to rent, one visits many such places and asks the leasing office staff various questions while doing so.
You can select a method and see the logs generated for that method. The idea is to have all the information (request parameters, return values, logs, metrics) in one place, to make it easier to troubleshoot problems.