Ingest data from a Worker, and analyze using MotherDuck
In this tutorial, you will learn how to ingest clickstream data into an R2 bucket using Pipelines. You will use the Pipeline binding to send the clickstream data from your Worker to the R2 bucket. You will then connect the bucket to MotherDuck and query the data there.
For this tutorial, you will build a landing page of an e-commerce website. A user can click on the view button to view the product details or click on the add to cart button to add the product to their cart.
- A MotherDuck account.
- Install Node.js.
Node.js version manager
Use a Node version manager like Volta or nvm to avoid permission issues and change Node.js versions. Wrangler, discussed later in this guide, requires a Node version of 16.17.0 or later.
You will create a new Worker project that will use Static Assets to serve the HTML file.
Create a new Worker project by running one of the following commands, depending on your package manager:

npm create cloudflare@latest -- e-commerce-pipelines
pnpm create cloudflare@latest e-commerce-pipelines
yarn create cloudflare e-commerce-pipelines

For setup, select the following options:
- For What would you like to start with?, choose Hello World Starter.
- For Which template would you like to use?, choose Worker + Assets.
- For Which language do you want to use?, choose TypeScript.
- For Do you want to use git for version control?, choose Yes.
- For Do you want to deploy your application?, choose No (we will be making some changes before deploying).
Navigate to the e-commerce-pipelines directory:
cd e-commerce-pipelines

Using Static Assets, you can serve the frontend of your application from your Worker. The above step creates a new Worker project with a default public/index.html file. Update the public/index.html file with the following HTML code:
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <title>E-commerce Store</title>
    <script src="https://cdn.tailwindcss.com"></script>
  </head>
  <body>
    <nav class="bg-gray-800 text-white p-4">
      <div class="container mx-auto flex justify-between items-center">
        <a href="/" class="text-xl font-bold"> E-Commerce Demo </a>
        <div class="space-x-4 text-gray-800">
          <a href="#">
            <button class="border border-input bg-white h-10 px-4 py-2 rounded-md">Cart</button>
          </a>
          <a href="#">
            <button class="border border-input bg-white h-10 px-4 py-2 rounded-md">Login</button>
          </a>
          <a href="#">
            <button class="border border-input bg-white h-10 px-4 py-2 rounded-md">Signup</button>
          </a>
        </div>
      </div>
    </nav>
    <div class="container mx-auto px-4 py-8">
      <h1 class="text-3xl font-bold mb-6">Our Products</h1>
      <div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6" id="products">
        <!-- This section repeats for each product -->

        <!-- End of product section -->
      </div>
    </div>

    <script>
      // demo products
      const products = [
        { id: 1, name: 'Smartphone X', desc: 'Latest model with advanced features', cost: 799 },
        { id: 2, name: 'Laptop Pro', desc: 'High-performance laptop for professionals', cost: 1299 },
        { id: 3, name: 'Wireless Earbuds', desc: 'True wireless earbuds with noise cancellation', cost: 149 },
        { id: 4, name: 'Smart Watch', desc: 'Fitness tracker and smartwatch combo', cost: 199 },
        { id: 5, name: '4K TV', desc: 'Ultra HD smart TV with HDR', cost: 599 },
        { id: 6, name: 'Gaming Console', desc: 'Next-gen gaming system', cost: 499 },
      ];

      // function to render products
      function renderProducts() {
        console.log('Rendering products...');
        const productContainer = document.getElementById('products');
        productContainer.innerHTML = ''; // Clear existing content
        products.forEach((product) => {
          const productElement = document.createElement('div');
          productElement.classList.add('rounded-lg', 'border', 'bg-card', 'text-card-foreground', 'shadow-sm');
          productElement.innerHTML = `
            <div class="flex flex-col space-y-1.5 p-6">
              <h2 class="text-2xl font-semibold leading-none tracking-tight">${product.name}</h2>
            </div>
            <div class="p-6 pt-0">
              <p>${product.desc}</p>
              <p class="font-bold mt-2">$${product.cost}</p>
            </div>
            <div class="flex items-center p-6 pt-0 flex justify-between">
              <button class="border px-4 py-2 rounded-md" onclick="handleClick('product_view', ${product.id})" name="">View Details</button>
              <button class="border px-4 py-2 rounded-md" onclick="handleClick('add_to_cart', ${product.id})">Add to Cart</button>
            </div>
          `;
          productContainer.appendChild(productElement);
        });
      }
      renderProducts();

      // function to handle click events
      async function handleClick(action, id) {
        console.log(`Clicked ${action} for product with id ${id}`);
      }
    </script>
  </body>
</html>

The above code does the following:
- Uses Tailwind CSS to style the page.
- Renders a list of products.
- Adds a button to view the details of a product.
- Adds a button to add a product to the cart.
- Contains a handleClick function to handle the click events. This function logs the action and the product ID. In the next steps, you will add the logic to send the click events to your pipeline.
You need to send clickstream data like the timestamp, user_id, session_id, and device_info to your pipeline. You can generate this data on the client side. Add the following function in the <script> tag in your public/index.html. This function gets the device information:
function extractDeviceInfo(userAgent) {
  let browser = "Unknown";
  let os = "Unknown";
  let device = "Unknown";

  // Extract browser. Edge and Opera user agents also contain "Chrome" and
  // "Safari", and Chrome user agents contain "Safari", so the most specific
  // tokens are checked first. "Edg" matches both "Edg/" and "Edge/".
  if (userAgent.includes("Edg")) {
    browser = "Edge";
  } else if (userAgent.includes("Opera") || userAgent.includes("OPR")) {
    browser = "Opera";
  } else if (userAgent.includes("Firefox")) {
    browser = "Firefox";
  } else if (userAgent.includes("Chrome")) {
    browser = "Chrome";
  } else if (userAgent.includes("Safari")) {
    browser = "Safari";
  } else if (userAgent.includes("MSIE") || userAgent.includes("Trident/")) {
    browser = "Internet Explorer";
  }

  // Extract OS. Android user agents contain "Linux" and iOS user agents
  // contain "Mac OS X", so the mobile platforms are checked first.
  if (userAgent.includes("Android")) {
    os = "Android";
  } else if (["iPhone", "iPad", "iPod"].some((token) => userAgent.includes(token))) {
    os = "iOS";
  } else if (userAgent.includes("Win")) {
    os = "Windows";
  } else if (userAgent.includes("Mac")) {
    os = "MacOS";
  } else if (userAgent.includes("Linux")) {
    os = "Linux";
  }

  // Extract device
  const mobileKeywords = [
    "Android",
    "webOS",
    "iPhone",
    "iPad",
    "iPod",
    "BlackBerry",
    "Windows Phone",
  ];
  device = mobileKeywords.some((keyword) => userAgent.includes(keyword))
    ? "Mobile"
    : "Desktop";

  return { browser, os, device };
}

Next, update the handleClick function to make a POST request to the /api/clickstream endpoint with the data:
async function handleClick(action, id) {
  console.log(`Clicked ${action} for product with id ${id}`);

  const userAgent = window.navigator.userAgent;
  const timestamp = new Date().toISOString();
  const { browser, os, device } = extractDeviceInfo(userAgent);

  const data = {
    timestamp,
    session_id: "1234567890abcdef", // For production use a unique session ID
    user_id: "user" + Math.floor(Math.random() * 1000), // For production use a unique user ID
    event_data: {
      event_id: Math.floor(Math.random() * 1000),
      event_type: action,
      page_url: window.location.href,
      timestamp,
      product_id: id,
    },
    device_info: {
      browser,
      os,
      device,
      userAgent,
    },
    referrer: document.referrer,
  };
  try {
    const response = await fetch("/api/clickstream", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ data }),
    });
    if (!response.ok) {
      throw new Error("Failed to send data to pipeline");
    }
  } catch (error) {
    console.error("Error sending data to pipeline:", error);
  }
}

The handleClick function does the following:
- Gets the device information using the extractDeviceInfo function.
- Creates a POST request to the /api/clickstream endpoint with the data.
- Logs any errors that occur.
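The hard-coded session_id and the random user_id in handleClick are placeholders. One possible way to produce a session ID that stays stable across clicks is sketched below in TypeScript. This is a minimal sketch and not part of the original tutorial: getOrCreateSessionId, KeyValueStore, createMemoryStore, and the clickstream_session_id key are hypothetical names, and the ID generator is passed in so the helper does not depend on a particular browser API.

```typescript
// Minimal key-value interface (an assumption for this sketch).
// In a browser, window.sessionStorage has these two methods.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key, used for illustration only.
const SESSION_KEY = "clickstream_session_id";

// Returns the stored session ID, creating and persisting one on first use.
function getOrCreateSessionId(
  store: KeyValueStore,
  generateId: () => string,
): string {
  const existing = store.getItem(SESSION_KEY);
  if (existing !== null && existing !== "") {
    return existing;
  }
  const created = generateId();
  store.setItem(SESSION_KEY, created);
  return created;
}

// In-memory store, useful outside the browser (for example in tests).
function createMemoryStore(): KeyValueStore {
  const values = new Map<string, string>();
  return {
    getItem: (key) => values.get(key) ?? null,
    setItem: (key, value) => {
      values.set(key, value);
    },
  };
}
```

In the browser, a call might look like getOrCreateSessionId(window.sessionStorage, () => crypto.randomUUID()), assuming the page runs in a secure context where crypto.randomUUID is available.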
You will create a new R2 bucket to use as the sink for your pipeline. Create a new R2 bucket named clickstream-data using the Wrangler CLI:
npx wrangler r2 bucket create clickstream-data

You need to create a new pipeline and connect it to the R2 bucket you created in the previous step.
Create a new pipeline clickstream-pipeline using the Wrangler CLI:
npx wrangler pipelines create clickstream-pipeline --r2-bucket clickstream-data --compression none --batch-max-seconds 5

When you run the command, you will be prompted to authorize Cloudflare Workers Pipelines to create R2 API tokens on your behalf. Your Pipeline requires these tokens and uses them when loading data into your bucket. You can approve the request through the browser link, which opens automatically.

Successfully created Pipeline "clickstream-pipeline" with ID <PIPELINE_ID>

Id:    <PIPELINE_ID>
Name:  clickstream-pipeline
Sources:
  HTTP:
    Endpoint:        https://<PIPELINE_ID>.pipelines.cloudflare.com
    Authentication:  off
    Format:          JSON
  Worker:
    Format:  JSON
Destination:
  Type:         R2
  Bucket:       clickstream-data
  Format:       newline-delimited JSON
  Compression:  NONE
  Batch hints:
    Max bytes:     100 MB
    Max duration:  300 seconds
    Max records:   10,000,000

You can now send data to your Pipeline!

Send data to your Pipeline's HTTP endpoint:

curl "https://<PIPELINE_ID>.pipelines.cloudflare.com" -d '[{"foo": "bar"}]'

You have set up the frontend of your application to call the /api/clickstream route every time the user clicks one of the buttons. That route forwards the clickstream data to your pipeline and is handled by a Worker in the src/index.ts file.
You will use the pipelines binding to send the clickstream data to your pipeline. In your Wrangler configuration file, add the following binding (shown first in JSON format, then in TOML format):

{
  "pipelines": [
    {
      "binding": "PIPELINE",
      "pipeline": "clickstream-pipeline"
    }
  ]
}

[[pipelines]]
binding = "PIPELINE"
pipeline = "clickstream-pipeline"

Next, update the type in the worker-configuration.d.ts file. Add the following type:
interface Env {
  PIPELINE: Pipeline;
}

Update the src/index.ts file to handle the POST request:
export default {
  async fetch(request, env, ctx): Promise<Response> {
    const pathname = new URL(request.url).pathname;
    const method = request.method;
    if (pathname === "/api/clickstream" && method === "POST") {
      const body = (await request.json()) as { data: any };
      try {
        await env.PIPELINE.send([body.data]);
        return new Response("OK", { status: 200 });
      } catch (error) {
        console.error(error);
        return new Response("Internal Server Error", { status: 500 });
      }
    }
    return new Response("Hello World!");
  },
} satisfies ExportedHandler<Env>;

The src/index.ts file does the following:
- Checks if the request is a POST request to the /api/clickstream endpoint.
- Extracts the data from the request body.
- Sends the data to your Pipeline.
- Returns a response to the client.
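The Worker above casts the request body to { data: any } and forwards it unchecked. If you want to reject malformed payloads before they reach the bucket, a runtime check along the following lines could be used. This is a minimal sketch under stated assumptions: ClickstreamEvent, isRecord, and isClickstreamEvent are hypothetical names that do not appear in the original tutorial, the interface mirrors the object built by handleClick, and the guard deliberately checks only a subset of the fields.

```typescript
// Shape of the payload built by handleClick in public/index.html.
interface ClickstreamEvent {
  timestamp: string;
  session_id: string;
  user_id: string;
  event_data: {
    event_id: number;
    event_type: string;
    page_url: string;
    timestamp: string;
    product_id: number;
  };
  device_info: { browser: string; os: string; device: string; userAgent: string };
  referrer: string;
}

// True for plain objects (not null, not arrays).
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical guard: a partial check covering the fields a query is most
// likely to rely on, not a full schema validation.
function isClickstreamEvent(value: unknown): value is ClickstreamEvent {
  if (!isRecord(value)) return false;
  const { timestamp, session_id, user_id, event_data, device_info } = value;
  return (
    typeof timestamp === "string" &&
    !Number.isNaN(Date.parse(timestamp)) &&
    typeof session_id === "string" &&
    typeof user_id === "string" &&
    isRecord(event_data) &&
    typeof event_data.event_type === "string" &&
    typeof event_data.product_id === "number" &&
    isRecord(device_info)
  );
}
```

In the Worker, the parsed body could be treated as unknown, and the handler could return a 400 response when isClickstreamEvent(body.data) is false instead of calling env.PIPELINE.send.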
You can run the local server to test the application. However, the data will not be sent to the production pipeline. You need to deploy the application to send data to the pipeline.
To start the development server, run the following command and navigate to http://localhost:8787:
npm run dev

When you click on the View Details or the Add to Cart button, the handleClick function calls the /api/clickstream endpoint. This endpoint uses the pipelines binding to send the clickstream data. Note that no actual data is sent to the pipeline when running the application locally.
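Because nothing reaches the pipeline locally, it can help to see what the Worker would have sent. One way is to exercise the forwarding logic against an in-memory stand-in for the binding, as in the sketch below. The names PipelineLike, RecordingPipeline, and forwardClick are hypothetical, and PipelineLike only assumes the one method the tutorial uses, send() with an array of records.

```typescript
// Only the part of the binding this tutorial uses (an assumption for this sketch).
interface PipelineLike {
  send(records: object[]): Promise<void>;
}

// Stand-in that keeps records in memory instead of sending them anywhere.
class RecordingPipeline implements PipelineLike {
  readonly sent: object[] = [];

  async send(records: object[]): Promise<void> {
    this.sent.push(...records);
  }
}

// Mirrors the Worker's logic: forward body.data and report a status code.
async function forwardClick(
  pipeline: PipelineLike,
  body: { data: object },
): Promise<number> {
  try {
    await pipeline.send([body.data]);
    return 200;
  } catch (error) {
    console.error(error);
    return 500;
  }
}
```

After calling forwardClick with a RecordingPipeline, the sent array holds the records that env.PIPELINE.send would receive in the deployed Worker.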
To deploy the application, run the following command:
npm run deploy

This will deploy the application to the Cloudflare Workers platform.

Building list of assets...
Starting asset upload...
Found 1 new or modified static asset to upload. Proceeding with upload...
+ /index.html
Uploaded 1 of 1 assets
Success! Uploaded 1 file (2.37 sec)

Total Upload: 25.73 KiB / gzip: 6.17 KiB
Worker Startup Time: 15 ms
Uploaded e-commerce-clickstream (11.79 sec)
Deployed e-commerce-clickstream triggers (7.60 sec)
  https://<URL>.workers.dev
Current Version ID: <VERSION_ID>

You can access the application at the URL provided in the output of the command. Now when you click on the View Details or Add to Cart button, the clickstream data will be sent to your pipeline.
Your application sends clickstream data to the R2 bucket using pipelines. In this step, you will connect the R2 bucket to MotherDuck.
You can connect the bucket to MotherDuck in several ways, which you can learn about from the MotherDuck documentation. In this tutorial, you will connect the bucket to MotherDuck using the MotherDuck dashboard.
Before connecting the bucket to MotherDuck, you need to obtain the Access Key ID and Secret Access Key for the R2 bucket. You can find the instructions to obtain the keys in the R2 API documentation.
- Log in to the MotherDuck dashboard and select your profile.
- Navigate to the Secrets page.
- Select the Add Secret button and enter the following information:
  - Secret Name: Clickstream pipeline
  - Secret Type: Cloudflare R2
  - Access Key ID: ACCESS_KEY_ID (replace with the Access Key ID)
  - Secret Access Key: SECRET_ACCESS_KEY (replace with the Secret Access Key)
- Select the Add Secret button to save the secret.
In this step, you will query the data stored in the R2 bucket using MotherDuck.
- Navigate back to the MotherDuck dashboard and select the + icon to add a new Notebook.
- Select the Add Cell button to add a new cell to the notebook.
- In the cell, enter the following query and select the Run button to execute the query:

SELECT * FROM read_json_auto('r2://<BUCKET_NAME>/**/*');

Replace the <BUCKET_NAME> placeholder with the name of the R2 bucket (clickstream-data in this tutorial).
The query will return the data stored in the R2 bucket.
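To make the shape of the stored data more concrete, the sketch below parses a few newline-delimited JSON records and counts them by event type. It assumes the objects in the bucket are newline-delimited JSON, as the pipeline creation output states, and that each record has the shape sent by handleClick. The sample records and the parseNdjson and countByEventType helpers are hypothetical and only for illustration; MotherDuck remains the tool for the actual querying.

```typescript
// Parses newline-delimited JSON: one JSON value per non-empty line.
function parseNdjson(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

// Counts records per event_data.event_type, skipping records without one.
function countByEventType(records: unknown[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const eventType = (record as { event_data?: { event_type?: unknown } } | null)
      ?.event_data?.event_type;
    if (typeof eventType !== "string") continue;
    counts.set(eventType, (counts.get(eventType) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical sample of what an object in the bucket might contain.
const sampleNdjson = [
  '{"user_id":"user1","event_data":{"event_type":"product_view","product_id":1}}',
  '{"user_id":"user2","event_data":{"event_type":"add_to_cart","product_id":1}}',
  '{"user_id":"user3","event_data":{"event_type":"product_view","product_id":4}}',
].join("\n");

const eventCounts = countByEventType(parseNdjson(sampleNdjson));
console.log(Object.fromEntries(eventCounts));
```

The same grouping can be expressed in MotherDuck as an aggregate over event_data.event_type once the basic query returns rows.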
You have successfully built an e-commerce application that leverages Cloudflare's Pipelines, R2, and Workers. Through this tutorial, you've gained hands-on experience in:
- Creating a Workers project with a static frontend
- Generating and capturing clickstream data
- Setting up a Cloudflare Pipeline to ingest data into R2
- Connecting your R2 bucket to MotherDuck for advanced querying capabilities
This project serves as a foundation for building scalable, data-driven applications. You can now expand on this knowledge to create more complex e-commerce features, implement advanced analytics, or explore other Cloudflare products to enhance your application further.
For your next steps, consider exploring more advanced querying techniques with MotherDuck, implementing real-time analytics, or integrating additional Cloudflare services to further optimize your application's performance and security.
You can find the source code of the application in the GitHub repository.