
This approach doesn't use an ORM, and it runs the tests concurrently against the same database.

Every time I commit changes, my CI/CD pipeline follows these steps, in this order:

- I use sqitch for the database migrations (my DB is PostgreSQL).

- Run the migration script with `sqitch deploy`. It runs only the changes that haven't been deployed yet.

- Run sqitch's revert-all feature to check that the revert actions work well too.

- Run `sqitch deploy` again to test that the migration works from scratch.

- After the schema migration has been applied, run the integration tests with TypeScript and a test runner; these include a mix of application tests and database tests.

- If everything goes well, the pipeline runs the migration script against the staging environment, and eventually against the production database after a series of further steps in the pipeline.
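The deploy/revert/deploy cycle above can be sketched as a small shell helper (here `db_test` is a placeholder sqitch target name, not my actual configuration):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Deploy, prove the revert path works, then redeploy from scratch.
# "db_test" is an assumed sqitch target name for illustration.
migrate_and_check() {
    local target="$1"
    sqitch deploy "$target"     # applies only changes not yet deployed
    sqitch revert -y "$target"  # verifies every change reverts cleanly
    sqitch deploy "$target"     # full redeploy from an empty schema
}
```

The `-y` flag skips sqitch's interactive confirmation prompt, which is what you want in CI.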

I test my database queries from TypeScript in this way:

- In practice I'm not strict about separating the tests for the database queries from the application code. Instead, I test the layers as they are being developed, starting with simple inserts into the database, where I test the application CRUD functions under development, plus the fixture generators (the code that generates synthetic data for my tests) and the deletion and test-cleanup capabilities.

- With that boilerplate in place, I start testing the complex queries, and if a query is large enough (and assuming there is no performance penalty for using CTEs in that case), I write my large queries in small chunks as CTEs, like this (replace SELECT 1 with your queries):

    export const sql_start = `
    WITH dummy_start AS (
        SELECT 1
    )`;

    export const step_2 = `${sql_start},
    step_2 AS (
        SELECT 1
    )
    `;

    export const step_3 = `${step_2},
    step_3 AS (
        SELECT 1
    )
    `;

    export const final_sql_query_to_use_in_app = `${step_3},
    final_sql_query_to_use_in_app AS (
        SELECT 1
    )

    SELECT * FROM final_sql_query_to_use_in_app
    `;

Then in my tests I can quickly pick any step of the CTE and test it:

    import {step_2, step_3, final_sql_query_to_use_in_app} from './my-query';

    test('my test', async t => {

        //
        // here goes the code that load the fixtures (testing data) to the database
        //

        //this is one test, repeat for each step of your sql query
        const sql = `${step_3} 
            SELECT * FROM step_3 WHERE .....
        `;
        const {rows: myResult} = await db.query(sql, [myParam]);
        t.is(myResult.length, 3);

        //
        // here goes the code that cleanup the testing data created for this test
        // 
    });


and in my application, I just use the final query:

    import {final_sql_query_to_use_in_app} from './my-query';

    db.query(final_sql_query_to_use_in_app)

The tests start with an empty database (`sqitch deploy` has just run on it); then each test creates its own data fixtures (this is the most time-consuming part of the test process) with UUIDs as synthetic data, so there are no conflicts between each test's data. That makes it possible to run the tests concurrently, which is important for detecting bugs in the queries too. I also include a cleanup process after each test, so after the tests finish the database is empty of data again.
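A minimal sketch of such a fixture generator (the field names here are illustrative, not my actual schema): every synthetic row gets its own UUID, so two tests running concurrently against the same database never collide.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical fixture generator: embedding the UUID in every value
// makes each test's data unique and easy to delete afterwards.
export function makeUserFixture() {
  const id = randomUUID();
  return {
    id,
    email: `user-${id}@example.test`,
    name: `user-${id}`,
  };
}
```

The cleanup step can then delete rows by that same UUID, leaving the shared database untouched for the other tests.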

For SQL queries that are critical pieces, I was able to develop thousands of automated tests with this approach, in combination with combinatorial approaches. In cases where a column of a view is basically an operation over states, if you write the logic in SQL directly, you can test the combinations of states from a spreadsheet (each column is a state): combining the states, you can fill in the expectations directly in the spreadsheet and hand it to the test suite, which runs the scenarios and expectations by consuming the CSV version of your spreadsheets.
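A rough sketch of turning such a CSV export into test scenarios (the column names `stateA`, `stateB`, `expected` are made up for illustration):

```typescript
// Each spreadsheet row becomes one test scenario: the state columns are
// the inputs and the "expected" column is the assertion.
export type Scenario = Record<string, string>;

export function parseScenarios(csv: string): Scenario[] {
  const [header, ...rows] = csv.trim().split("\n");
  const cols = header.split(",").map((c) => c.trim());
  return rows.map((row) => {
    const cells = row.split(",").map((c) => c.trim());
    return Object.fromEntries(cols.map((c, i) => [c, cells[i]]));
  });
}
```

The test runner then loops over `parseScenarios(...)`, inserts the states as fixtures, runs the view, and asserts the `expected` column, so non-developers can add cases by editing the spreadsheet.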

If you are interested in more details just ping me; I'll be happy to share more about my approach.


This is a beautiful and powerful approach (and great explanation), IMO. Personally, I'd keep all CTEs (your steps) as units and combine them separately. But you've commented on that in your other comment.


How do you know that what your SELECT statements return is correct? On a real database?


Yes, it always runs on a real database, without mocks, as integration tests where you can test each of the CTE's auxiliary statements separately; each one acts as a step of our SQL pipeline. So the initial data is inserted into the tables, and then I can run the assertions on the SQL queries or views against the real database. In theory it would be easy to rearrange the concatenation of the CTE strings above so a step can be tested as a unit, by materializing the previous auxiliary step as a temporary table, but I have never needed that, since the integration tests are simpler and work really well for me. The essential part of my approach is to treat the SQL code as concatenated pieces of strings, and to take slices of it with just the right concatenation to exercise that slice against the real database, which works because the query is always valid.

There is another pattern too. When I implemented RLS-based multi-tenancy with RBAC support, which required a relatively large SQL codebase and needed to be battle-tested because it was critical, I split a big part of the SQL code into many SQL functions instead of views to test the code units or integrations (using something similar to dependency injection, but for the data, to switch the tenant RBAC contexts), because for the SQL functions I can pass different PostgreSQL configuration parameters to test different tenants, for example.
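To give an idea of the configuration-parameter trick: PostgreSQL's `set_config(..., true)` scopes a setting to the current transaction, so each test can impersonate a different tenant. This is only a sketch; the parameter name `app.tenant_id` and the minimal `Db` interface are assumptions, not my actual code.

```typescript
// Minimal stand-in for a pg-style client.
type Db = {
  query: (sql: string, params?: unknown[]) => Promise<{ rows: unknown[] }>;
};

// Run `fn` with a tenant context set for this transaction only, then
// roll back so concurrent tests leave no trace in the shared database.
export async function withTenant<T>(
  db: Db,
  tenantId: string,
  fn: () => Promise<T>
): Promise<T> {
  await db.query("BEGIN");
  try {
    // The `true` argument makes the setting transaction-local.
    await db.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    return await fn();
  } finally {
    await db.query("ROLLBACK");
  }
}
```

Inside the SQL functions, `current_setting('app.tenant_id')` reads the value back, which is how RLS policies typically pick up the tenant.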


Yes, I use it in that way. And if ChatGPT doesn't generate the code with pure functions (usually it doesn't), you can explicitly ask it to generate the code with pure functions. Then ask it to generate the tests.

Usually I get good tests from ChatGPT when I approach it as an iterative process, requesting multiple improvements to the generated tests based on what it gives me. Note that it doesn't replace the skill set you need to write good test coverage.

For example, you can ask it to generate integration tests instead of unit tests in case it needs the context. Providing details on how the testing code should be generated really helps. So does asking it to refactor the code in preparation for making it testable, for example, or requesting that it convert some functions into actual pure functions, or that it refactor a piece of generated code into a separate function. Then you ask it to generate tests for normal and also for boundary conditions. The more specific you get, the higher the chance of getting good and extensive tests from it.

Having both the tests and the code generated by ChatGPT really helps to catch the subtle bugs it usually introduces in the generated code (which I fix manually); usually I get test coverage that proves the robustness I need for production code.

This approach still needs manual fine-tuning of the generated code; I think ChatGPT still struggles to get the context right. But in general, when it makes sense to use it, I'm more productive writing tests this way than manually.


I can confirm: in Paraguay, only two small towns (Pilar and Ayolas) were affected, and only for a short period of time. Source: I live in Paraguay.


Nice to see Paraguay represented here on HN; I live in Asunción.

I also heard of some people near Luque without power Sunday and Monday, but I'd guess it was an unrelated ANDE incident.


Right now I'm reading this book on the topic: Power System Stability and Control by Prabha Kundur https://www.amazon.com/System-Stability-Control-Prabha-Kundu...


> Because a good DSL abstracts away things that are not relevant to the domain (e.g. in Excel, memory pointers and allocation) while retaining those that are (which data transformation you want to apply).

I agree. I wrote this small example of using a DSL for E2E testing, where you abstract away a lot of asynchronous calls, headless-browser specifics, and other low-level concepts to expose a simple API, so that domain experts can help you write tests: https://github.com/davps/tic-tac-toe/blob/master/src/App.tes...
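The shape of such a DSL can be sketched like this (the `Browser` interface and the selectors are made up for illustration, not the actual code from that repo): the domain verbs hide the selectors and async plumbing, so a tic-tac-toe test reads like the game itself.

```typescript
// Stand-in for a headless-browser client (e.g. a Puppeteer/Playwright page).
type Browser = {
  click(sel: string): Promise<void>;
  text(sel: string): Promise<string>;
};

// The DSL: domain experts see "playAt" and "winner", never CSS selectors
// or await-heavy browser calls.
export function ticTacToe(browser: Browser) {
  return {
    playAt: (cell: number) => browser.click(`[data-cell="${cell}"]`),
    winner: () => browser.text("#winner"),
  };
}
```

A test then reads as `await game.playAt(4); expect(await game.winner())...`, which a domain expert can follow and extend.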


Location: Paraguay, South America

Remote: Yes (and I can visit)

Willing to relocate: Yes, for the right opportunity.

Technologies: Expert in front-end development (JavaScript). Also exposure to back-end technologies based on Node.js, Java, and .NET. Some of the technologies I enjoy using every day are HTML/CSS/SVG, Backbone, React, Node.js, Vagrant, Linux, AWS, Git, Vim.

Résumé/CV: Available upon request.

Email: davidperez3010 at the gmail

I'm a multidisciplinary technologist with exposure to software development and electrical engineering. Currently I work remotely as a Lead Developer on visualization technologies for a US-based software company, and I have experience architecting and developing large-scale, commercial-grade JavaScript applications with heavy use of SVG. I'm an active proponent of Test-Driven Development; I like Protractor, and at the same time I care about writing software designed for unit and integration tests with good coverage.

I'll be visiting San Francisco, CA the week of January 26; let's grab coffee that week!


SEEKING WORK or FREELANCER - Remote (based in Paraguay - traveling is OK)

I am a multidisciplinary technologist: a graduate in electrical engineering with an emphasis on power systems, and a full-stack web developer.

* JavaScript, jQuery, Backbone, Require, Easel

* HTML, CSS, less, Twitter Bootstrap

* Ruby, Ruby on Rails, Java (Spring 3)

* PostgreSQL, MySQL

* Heroku, GAE/J

More about me and my projects: http://davps.github.com/about.html


Power failures caused by lightning strikes are relatively easy to test with platforms like RTDS [1] (I am not affiliated with RTDS).

I know that you can test your electrical protection systems in real time for almost all the possibilities you can imagine (thousands of them), for example: faults in your high-voltage utility distribution system, breaker failures, coordination of the protection systems, loss of your back-up generator power. I don't know their systems or their philosophies; it would be interesting to know why they don't parallelize groups of generators (in the backup system) so that, when one generator fails, the power load is balanced across the others (using well-known schemes to avoid cascading failures).

[1] http://www.rtds.com/applications/applications.html

