
API Testing with Fictitious Data: Complete Guide

API testing is fundamental in modern software development. Using valid fictitious data, such as algorithmically generated CPFs and CNPJs, lets you exercise your endpoints with realistic input without exposing real people's data.

Why use fictitious data in API testing?

When developing APIs that process Brazilian documents such as CPF, CNPJ, and RG, it's essential to verify that endpoints correctly accept and process this data. Using real personal data in tests risks violating the LGPD (Lei Geral de Proteção de Dados, Brazil's data protection law) and can expose sensitive third-party information.

Fictitious data built with the official check digit algorithms exercises the API's validation logic exactly as real documents would, while keeping real personal data out of your test environments. CPF and CNPJ generators produce numbers that pass check digit validation, so they simulate real scenarios safely.
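To make the check digit idea concrete, here is a minimal sketch of CPF generation and validation using the standard modulo 11 rule. The function names are illustrative and not taken from any particular tool.

```python
import random

def cpf_check_digit(digits):
    """Modulo 11 check digit: weights run from len(digits) + 1 down to 2."""
    weights = range(len(digits) + 1, 1, -1)
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def generate_cpf(rng=None):
    """Return an 11-digit fictitious CPF with correct check digits."""
    rng = rng or random.Random()
    base = [rng.randint(0, 9) for _ in range(9)]
    # Avoid a base of nine equal digits, which yields a repeated-digit CPF.
    while len(set(base)) == 1:
        base = [rng.randint(0, 9) for _ in range(9)]
    d1 = cpf_check_digit(base)
    d2 = cpf_check_digit(base + [d1])
    return "".join(map(str, base + [d1, d2]))

def is_valid_cpf(cpf):
    """Validate an unmasked CPF string (digits only)."""
    if len(cpf) != 11 or not (cpf.isascii() and cpf.isdigit()):
        return False
    if len(set(cpf)) == 1:  # 000.000.000-00, 111.111.111-11, ...
        return False
    digits = [int(c) for c in cpf]
    return (cpf_check_digit(digits[:9]) == digits[9]
            and cpf_check_digit(digits[:10]) == digits[10])
```

Passing a seeded `random.Random` to `generate_cpf` makes the output repeatable, which is useful when a failing test has to be replayed.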

Strategies for API testing with Brazilian data

An effective approach is to create data factories that generate complete payloads for each endpoint. For example, a customer registration endpoint may require CPF, name, address, and phone number, and a factory can fill every one of these fields with valid fictitious data automatically.
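A data factory for the customer registration example could look roughly like the sketch below. The field names, the name lists, and the address and phone formats are assumptions for illustration; adapt them to the schema your endpoint actually expects.

```python
import random

# Hypothetical sample values; any list of obviously fake names works.
FIRST_NAMES = ["Ana", "Bruno", "Carla", "Diego"]
LAST_NAMES = ["Silva", "Souza", "Oliveira", "Lima"]

def _check_digit(digits):
    weights = range(len(digits) + 1, 1, -1)
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def fake_cpf(rng):
    base = [rng.randint(0, 9) for _ in range(9)]
    while len(set(base)) == 1:  # skip repeated-digit bases
        base = [rng.randint(0, 9) for _ in range(9)]
    d1 = _check_digit(base)
    d2 = _check_digit(base + [d1])
    return "".join(map(str, base + [d1, d2]))

def make_customer_payload(rng=None, **overrides):
    """Build a complete registration payload; keyword overrides replace fields."""
    rng = rng or random.Random()
    payload = {
        "cpf": fake_cpf(rng),
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "address": {
            "street": f"Rua Teste {rng.randint(1, 999)}",
            "city": "São Paulo",
            "state": "SP",
            # Placeholder 8-digit CEP; not guaranteed to be a real postal code.
            "zip_code": f"{rng.randint(0, 99999999):08d}",
        },
        # Placeholder mobile number in +55 11 9XXXXXXXX shape.
        "phone": f"+55119{rng.randint(0, 99999999):08d}",
    }
    payload.update(overrides)
    return payload
```

The `overrides` argument keeps failure tests short: `make_customer_payload(cpf="11111111111")` produces an otherwise valid payload with one deliberately bad field.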

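Another important strategy is testing error scenarios: sending CPFs with wrong check digits, incorrectly formatted CNPJs, and documents made of a single repeated digit (such as 111.111.111-11, which satisfies the check digit arithmetic but is conventionally rejected). This verifies that the API returns an appropriate error code, such as 400 Bad Request, and a clear message to the consumer.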
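The sketch below shows one way to organize such error cases. `handle_registration` is a hypothetical in-process stand-in for the real endpoint, and the error body shape is an assumption; in a real suite the same cases would be sent over HTTP. Only CPF cases are shown, but CNPJ cases follow the same pattern.

```python
def _check_digit(digits):
    weights = range(len(digits) + 1, 1, -1)
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def is_valid_cpf(cpf):
    if len(cpf) != 11 or not (cpf.isascii() and cpf.isdigit()):
        return False
    if len(set(cpf)) == 1:
        return False
    digits = [int(c) for c in cpf]
    return (_check_digit(digits[:9]) == digits[9]
            and _check_digit(digits[:10]) == digits[10])

def handle_registration(payload):
    """Hypothetical stand-in for the endpoint: returns (status, body)."""
    cpf = payload.get("cpf")
    if not isinstance(cpf, str) or not is_valid_cpf(cpf):
        return 400, {"error": "invalid_cpf",
                     "message": "CPF must have 11 digits and valid check digits."}
    return 201, {"cpf": cpf}

# Label -> CPF value to send (None means the field is omitted).
ERROR_CASES = {
    "wrong check digits": "12345678901",
    "repeated digits": "11111111111",
    "missing field": None,
}

def check_error_cases():
    """Run every error case and collect (status, error code) per label."""
    results = {}
    for label, cpf in ERROR_CASES.items():
        payload = {} if cpf is None else {"cpf": cpf}
        status, body = handle_registration(payload)
        results[label] = (status, body.get("error"))
    return results
```

Keeping the cases in a labeled table means a new failure mode is one extra line, and the labels show up in the results when something regresses.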

Automating tests with generation tools

Integrating fictitious data generators into automated testing pipelines is a recommended practice. Tools like help4.dev can be used to quickly generate valid CPFs, CNPJs, credit card numbers, and other document numbers to populate test databases.
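When a pipeline cannot call an external tool, a small local generator can fill the same role. The sketch below covers the classic 14-digit numeric CNPJ with its two modulo 11 check digits. It is a local illustration, not the API of help4.dev or any other tool, and it assumes the numeric-only format.

```python
import random

# Weights for the first and second CNPJ check digits.
FIRST_WEIGHTS = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
SECOND_WEIGHTS = [6] + FIRST_WEIGHTS

def _check_digit(digits, weights):
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def generate_cnpj(rng=None):
    """Return a 14-digit fictitious CNPJ with correct check digits."""
    rng = rng or random.Random()
    # 8-digit root followed by branch number 0001.
    base = [rng.randint(0, 9) for _ in range(8)] + [0, 0, 0, 1]
    d1 = _check_digit(base, FIRST_WEIGHTS)
    d2 = _check_digit(base + [d1], SECOND_WEIGHTS)
    return "".join(map(str, base + [d1, d2]))

def is_valid_cnpj(cnpj):
    """Validate an unmasked numeric CNPJ string."""
    if len(cnpj) != 14 or not (cnpj.isascii() and cnpj.isdigit()):
        return False
    if len(set(cnpj)) == 1:
        return False
    digits = [int(c) for c in cnpj]
    return (_check_digit(digits[:12], FIRST_WEIGHTS) == digits[12]
            and _check_digit(digits[:13], SECOND_WEIGHTS) == digits[13])

def format_cnpj(cnpj):
    """Apply the usual mask: 00.000.000/0000-00."""
    return f"{cnpj[:2]}.{cnpj[2:5]}.{cnpj[5:8]}/{cnpj[8:12]}-{cnpj[12:]}"
```

`format_cnpj` is included so the same generated value can be sent both masked and unmasked.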

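In CI/CD pipelines, seed scripts can use these generators to create consistent data sets before each test execution. This removes the dependency on hand-maintained static fixtures and keeps each run independent. For runs to be reproducible as well, the generator must be driven by a fixed random seed, so that the same seed always yields the same data.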
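A seed script along these lines might look like the following sketch, which uses an in-memory SQLite database as a stand-in for the test database. The `customers` table, its columns, and the default seed value are assumptions for illustration.

```python
import random
import sqlite3

def _check_digit(digits):
    weights = range(len(digits) + 1, 1, -1)
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def fake_cpf(rng):
    base = [rng.randint(0, 9) for _ in range(9)]
    while len(set(base)) == 1:  # skip repeated-digit bases
        base = [rng.randint(0, 9) for _ in range(9)]
    d1 = _check_digit(base)
    d2 = _check_digit(base + [d1])
    return "".join(map(str, base + [d1, d2]))

def seed_customers(conn, count=10, seed=1234):
    """Reset the table and insert `count` customers derived from `seed`."""
    rng = random.Random(seed)  # fixed seed -> identical data on every run
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, cpf TEXT NOT NULL)"
    )
    conn.execute("DELETE FROM customers")  # start every run from a clean state
    cpfs = [fake_cpf(rng) for _ in range(count)]
    conn.executemany("INSERT INTO customers (cpf) VALUES (?)", [(c,) for c in cpfs])
    conn.commit()
    return cpfs

def fresh_database(seed=1234):
    """Create an isolated in-memory database already populated with seed data."""
    conn = sqlite3.connect(":memory:")
    seed_customers(conn, seed=seed)
    return conn
```

Logging the seed at the start of a pipeline run is a cheap way to make any failure replayable later.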

Testing format and mask validations

Well-built APIs should accept documents both with and without masks. A CPF can be sent as '12345678909' or '123.456.789-09', and the API should normalize both to the same value before validating the check digits. Testing with fictitious data makes it easy to cover these variations.
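As a rough illustration of the normalization step, the sketch below accepts exactly the two shapes described above and returns the bare digits. Trimming surrounding whitespace is an added assumption, and check digit validation would run on the normalized value afterwards.

```python
import re

# Either 11 bare digits or the full 000.000.000-00 mask; nothing in between.
CPF_PATTERN = re.compile(r"\d{11}|\d{3}\.\d{3}\.\d{3}-\d{2}", re.ASCII)

def normalize_cpf(value):
    """Return the 11 bare digits, or None if the input matches neither format."""
    value = value.strip()
    if not CPF_PATTERN.fullmatch(value):
        return None
    return re.sub(r"[.-]", "", value)
```

Rejecting partially masked input such as '123.456.78909' is a design choice; a more lenient API could strip all punctuation instead, and the tests should pin down whichever behavior is intended.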

Beyond format, it's important to test boundaries: documents with incorrect length, unexpected special characters, and empty values. Each scenario should return a client error response whose message identifies the specific problem, and having a solid base of fictitious data makes creating these test cases straightforward.
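One way to keep boundary cases manageable is a table of inputs paired with the error each should produce. In the sketch below, the error codes and the `cpf_format_error` function are hypothetical; they only classify the shape of the input and leave check digit validation to a later step.

```python
def cpf_format_error(value):
    """Return a hypothetical error code for malformed input, else None."""
    if not isinstance(value, str) or not value.strip():
        return "cpf_required"
    stripped = value.strip()
    # Only digits and the mask characters '.' and '-' are tolerated.
    if any(ch not in "0123456789.-" for ch in stripped):
        return "cpf_invalid_characters"
    digit_count = sum(ch in "0123456789" for ch in stripped)
    if digit_count != 11:
        return "cpf_invalid_length"
    return None

# (input, expected error code); None means the shape is acceptable.
BOUNDARY_CASES = [
    ("", "cpf_required"),
    ("   ", "cpf_required"),
    (None, "cpf_required"),
    ("1234567890", "cpf_invalid_length"),      # 10 digits
    ("123456789012", "cpf_invalid_length"),    # 12 digits
    ("123.456.789-0X", "cpf_invalid_characters"),
    ("123 456 789 09", "cpf_invalid_characters"),
    ("12345678909", None),
    ("123.456.789-09", None),
]

def failing_cases():
    """Return (input, expected, actual) for every case that does not match."""
    return [
        (value, expected, cpf_format_error(value))
        for value, expected in BOUNDARY_CASES
        if cpf_format_error(value) != expected
    ]
```

An empty result from `failing_cases()` means every boundary behaves as documented; the same table can feed a parametrized test in whichever framework the project uses.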

Best practices and next steps

Keep your test data isolated from the production environment. Use environment variables to configure which data source to use in each environment. In development and staging, use fictitious data generators; in production, never expose test data.
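A small guard can enforce this separation in code. The sketch below assumes two hypothetical environment variables, `APP_ENV` and `DATA_SOURCE`, and two hypothetical source names, "generator" and "database"; the actual names depend on your project.

```python
import os

# Default data source per environment (names are illustrative assumptions).
DEFAULT_SOURCES = {
    "development": "generator",
    "staging": "generator",
    "production": "database",
}

def resolve_data_source(env=None):
    """Pick the data source from environment variables, refusing unsafe combos."""
    env = os.environ if env is None else env
    app_env = env.get("APP_ENV")
    if app_env not in DEFAULT_SOURCES:
        # Fail loudly instead of guessing when the environment is unset or unknown.
        raise ValueError(f"APP_ENV must be one of {sorted(DEFAULT_SOURCES)}")
    source = env.get("DATA_SOURCE", DEFAULT_SOURCES[app_env])
    if app_env == "production" and source == "generator":
        raise RuntimeError("fictitious data generators must not run in production")
    return source
```

Accepting an optional mapping instead of always reading `os.environ` makes the function itself easy to test without touching the real environment.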

Document the data formats accepted by each endpoint and create a test suite that covers both success and failure scenarios. By combining fictitious data generators with testing tools such as Jest, pytest, or Postman, you build robust and reliable test coverage.