    const Host = require('@scramjet/core');

    // Create a Scramjet "Browser" instance (the Host)
    const host = new Host();
| Feature | Puppeteer/Playwright | Apache Spark | Scramjet Browser |
| :--- | :--- | :--- | :--- |
| Primary Use | Browser Automation | Big Data Batch | Real-time Streaming |
| Resource Use | Very High (Spins up Chromium) | High (JVM overhead) | Very Low (Pure Node.js) |
| Learning Curve | Moderate | Steep (Scala/Python) | Low (Plain JavaScript) |
| Speed (Data Ops) | Slow (Renders visuals) | Fast (Distributed) | Hypersonic (Streaming) |
| Headless? | Yes (Full engine) | N/A | Yes (Minimal engine) |
In the world of DataOps and Cloud Computing, a "Headless Browser" is a browser that runs without a user interface, typically driven by a tool such as Puppeteer or Playwright. The Scramjet Browser is a massive leap beyond the headless browser. It is a multi-threaded, stream-processing engine designed to run at the server level.
But what if the browser wasn't a stage? What if it was a high-speed data pipeline?
    async function main() {
      // The "from()" method starts a stream of data
      await host
        .from([1, 2, 3, 4, 5]) // Simulate 5 pages
        .map(page => `https://example.com/page/${page}`) // Build URLs
        .flatMap(async (url) => fetch(url).then(res => res.text())) // Fetch HTML
        .map(html => html.match(/<img src="(.*?)"/g)) // Regex images
        .filter(Boolean) // Remove empty results
        .reduce((acc, images) => [...acc, ...images], []) // Combine
        .toArray() // Wait for result
        .then(console.log); // Output all matched <img src="..."> strings
    }

    main();
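To see the shape of that pipeline without installing anything, here is a minimal sketch of the same map, filter, and combine flow using only plain JavaScript async generators. It is not Scramjet's API: `samplePages`, `mapStream`, `filterStream`, `extractImageSources`, and `collectImageSources` are hypothetical names made up for illustration, and the hard-coded HTML strings stand in for fetched pages so the example runs offline.

```javascript
// Hypothetical stand-in for fetched pages: yields HTML strings one at a time.
async function* samplePages() {
  yield '<p>No images here</p>';
  yield '<img src="/a.png"><img src="/b.png">';
  yield '<div><img src="/c.jpg"></div>';
}

// Lazily transform each item as it arrives (the "map" stage).
async function* mapStream(source, fn) {
  for await (const item of source) {
    yield await fn(item);
  }
}

// Pass along only the items that satisfy the predicate (the "filter" stage).
async function* filterStream(source, predicate) {
  for await (const item of source) {
    if (predicate(item)) yield item;
  }
}

// Pull the src attribute values out of one HTML string.
// A regex is enough for a sketch; real-world HTML calls for a proper parser.
function extractImageSources(html) {
  return Array.from(html.matchAll(/<img\s[^>]*?src="([^"]*)"/g), (m) => m[1]);
}

// Wire the stages together and collect the results into one array.
async function collectImageSources(pages) {
  const perPage = mapStream(pages, extractImageSources);
  const nonEmpty = filterStream(perPage, (sources) => sources.length > 0);
  const all = [];
  for await (const sources of nonEmpty) {
    all.push(...sources);
  }
  return all;
}

collectImageSources(samplePages()).then((sources) => console.log(sources));
```

Each stage pulls one item at a time from the stage before it, so a page is processed as soon as it is available instead of waiting for every page to load first. Using a capture group with `matchAll` also returns the bare `src` values rather than the whole matched tag.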