You put a QR code on your box. You put another one on your insert card. Maybe a third on your thank-you sticker. Customers scan them. Orders come in. But you have no idea which QR code is actually bringing people back.
So you keep printing all three. Every order. Every month. Spending money on materials that might be doing nothing.
TL;DR: Most merchants count scans. What matters is purchases. Create one dynamic QR code per surface, give each a unique tracking link, and after 30 days check which one led to actual sales. Cut the ones that don't. Put more into the one that does.
Why Do Merchants Skip QR Code A/B Testing?
Because it feels hard. QR codes are printed on physical things. You can't change a box once it ships.
Here's the part most people miss: you don't need to reprint anything to run a test.
A dynamic QR code separates the printed code from the destination. The code on the box stays the same forever. But you control where it sends people. You can change the destination any time, from your phone, without touching the packaging.
So you print three different codes (one per surface), each pointing to a slightly different link. You track which link gets the most buyers. That's the whole test.
Do QR codes increase sales? Yes, when they lead somewhere worth going. Packaging QR codes get scanned about 10% of the time. Flyers? Around 3%. That's a real group of people. If you're not tracking what they do after the scan, you're guessing.
What You're Actually Testing
Most people test QR code design: color, size, whether to put a logo in the middle. That stuff matters a little.
Placement matters more.
Placement means: where on the order does the QR code appear?
Common spots to compare:
- Outer box or mailer bag (customer sees it before they even open the order)
- Insert card (tucked inside, they see it when unpacking)
- Thank-you card or sticker (attached to the product itself)
- Inner tissue or wrapping (less common, but used by premium brands)
Each spot hits the customer at a different moment. The box arrives and they're excited. The insert card is right there when they open it. The thank-you card comes after they've touched the product. Each moment puts the customer in a different headspace, and different headspaces lead to different actions.
You can't know which moment converts best without testing. For a deeper look at how each surface works as a marketing touchpoint, see packaging as a Shopify marketing channel.
How to Set Up the Test
Step 1: Create one dynamic QR code per surface
Do not use the same QR code on every surface. Each one needs to be separate so you can track it on its own.
Shopify does not include dynamic QR code tools or scan tracking. You'll need a separate app. HypeQR is built for this: create multiple codes, control where each one goes, and see how many people scan each one.
Make three codes. Name them clearly: "outer box," "insert card," "thank-you card." That's it for this step.
Step 2: Give each code a different tracking link
This is how you find out which QR code led to a purchase.
Each code points to a URL with a small tag on the end. The tag tells GA4 (Google Analytics) where the visitor came from. These tags are called UTM parameters. They look like this:
- Outer box: yourstore.com/discount/WELCOME?utm_source=qr&utm_medium=packaging&utm_campaign=post_purchase&utm_content=outer_box
- Insert card: ...&utm_content=insert_card
- Thank-you card: ...&utm_content=thankyou_card
The only part that changes is utm_content. Everything else stays the same across all three links. This keeps your GA4 data clean and easy to compare.
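If you'd rather generate the links than type them by hand, here is a minimal Python sketch. It assumes the placeholder store URL and WELCOME discount path from the example above, and it adds https:// in front, which the example leaves out. Swap in your own values.

```python
from urllib.parse import urlencode

# Placeholder store and discount path from the example links (assumption:
# your real store URL and discount code will differ).
BASE_URL = "https://yourstore.com/discount/WELCOME"

# These three tags stay identical across every code.
SHARED_PARAMS = {
    "utm_source": "qr",
    "utm_medium": "packaging",
    "utm_campaign": "post_purchase",
}


def tracking_url(placement: str) -> str:
    """Build the destination URL for one placement; only utm_content varies."""
    params = {**SHARED_PARAMS, "utm_content": placement}
    return f"{BASE_URL}?{urlencode(params)}"


links = {p: tracking_url(p) for p in ("outer_box", "insert_card", "thankyou_card")}

for placement, url in links.items():
    print(f"{placement}: {url}")
```

Paste each printed URL into the matching dynamic code as its destination. Generating them from one shared set of tags avoids the typo that quietly splits your data into two rows.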
Not set up in GA4 yet? The UTM tracking guide for Shopify walks through the full setup.
Step 3: Run the test for 30 days (or until each code gets 200 scans)
Don't check results after a week. Shipping takes different amounts of time. Some customers open packages right away. Others leave them on the counter for days.
30 days gives you a fair picture. If your store ships a lot of orders and each code hits 200 scans before 30 days, you can stop early. Under 200 scans per code means the results might just be random noise.
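The stopping rule above fits in a few lines. This is only a sketch: the function names and the scan counts passed in are made up for illustration, and the 30-day and 200-scan thresholds come straight from this step.

```python
MIN_DAYS = 30             # test window from this step
MIN_SCANS_PER_CODE = 200  # below this, results may be random noise


def every_code_has_enough(scans_by_code: dict[str, int]) -> bool:
    """True only if there is at least one code and all of them hit 200 scans."""
    return bool(scans_by_code) and all(
        scans >= MIN_SCANS_PER_CODE for scans in scans_by_code.values()
    )


def can_stop_test(days_elapsed: int, scans_by_code: dict[str, int]) -> bool:
    """Stop at 30 days, or earlier once every code has 200 scans."""
    return days_elapsed >= MIN_DAYS or every_code_has_enough(scans_by_code)


# Hypothetical scan counts on day 18 of a test.
day_18 = {"outer_box": 230, "insert_card": 205, "thankyou_card": 160}
print(can_stop_test(18, day_18))          # thank-you card is still short
print(every_code_has_enough(day_18))
```

If the 30 days run out while a code is still under 200 scans, the test is over but that code's numbers deserve extra caution.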
Step 4: Check which code led to the most purchases, not the most scans
Most people look at their QR tool, see which code got scanned the most, and call that the winner. That's the wrong move.
Scans are not sales.
You want to know which code led to people actually buying something.
In GA4: go to Reports > Acquisition > Traffic acquisition. Filter by utm_content (GA4 labels this dimension "Session manual ad content"). Compare purchases and revenue for each code, not sessions.
Here's an example of what results might look like:
| Placement | Scans | Purchases | Revenue |
|---|---|---|---|
| Outer box | 210 | 8 | $240 |
| Insert card | 140 | 19 | $570 |
| Thank-you card | 95 | 4 | $120 |
The insert card got fewer scans than the outer box. But it drove more than twice the purchases and more than twice the revenue. That's your winner, even though it "lost" on scan count.
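Before you crown a winner, it's worth a quick check that the gap isn't just luck. One common way is a two-proportion z-test on buyers per scan. That test isn't part of the setup described here; it's a standard statistical check, sketched below with the example numbers from the table. The function name is made up for illustration.

```python
import math


def two_proportion_z(buyers_a: int, scans_a: int, buyers_b: int, scans_b: int):
    """Compare two purchase-per-scan rates; returns (z, two-sided p-value)."""
    rate_a = buyers_a / scans_a
    rate_b = buyers_b / scans_b
    # Pooled rate: the conversion rate if both placements performed the same.
    pooled = (buyers_a + buyers_b) / (scans_a + scans_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / scans_a + 1 / scans_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Insert card (19 purchases / 140 scans) vs. outer box (8 / 210).
z, p = two_proportion_z(19, 140, 8, 210)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (below 0.05 is the usual rule of thumb) suggests the difference is unlikely to be noise. With very few purchases, like the 4 on the thank-you card, this approximation gets rough, so treat the answer as a sanity check and not a verdict.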
HypeQR gives you per-code scan data and lets you update destination URLs without reprinting. See how it works →
How to Read the Results
After your test window, look at three things for each placement:
- Scan rate (scans ÷ orders shipped): Is the code easy to find and scan? Low rate means it's buried or easy to miss.
- Buyers per scan (purchases ÷ scans): Are the people who scan actually buying? Low rate means the page or offer isn't convincing them.
- Revenue per scan (revenue ÷ scans): How much money does each scan bring in? This is the one number that tells you which placement is worth keeping.
Low scan rate: move the code somewhere more visible. Low buyers per scan: the problem is the landing page or the offer, not the placement.
These are different problems with different fixes. Knowing which one you have saves you from fixing the wrong thing.
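Here is a rough sketch of those three numbers in Python, using the example table from Step 4. Revenue per scan is computed as revenue ÷ scans. The 2,000 orders shipped is a made-up figure (the table doesn't include one), so the scan rates it prints are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    name: str
    scans: int
    purchases: int
    revenue: float


# Hypothetical: orders shipped during the test window. Every order carries
# all three codes, so the same count applies to each placement.
ORDERS_SHIPPED = 2000


def metrics(p: Placement, orders_shipped: int) -> dict[str, float]:
    """Scan rate, buyers per scan, and revenue per scan for one placement."""
    return {
        "scan_rate": p.scans / orders_shipped,
        "buyers_per_scan": p.purchases / p.scans if p.scans else 0.0,
        "revenue_per_scan": p.revenue / p.scans if p.scans else 0.0,
    }


results = [
    Placement("outer_box", 210, 8, 240.0),
    Placement("insert_card", 140, 19, 570.0),
    Placement("thankyou_card", 95, 4, 120.0),
]

report = {p.name: metrics(p, ORDERS_SHIPPED) for p in results}
best = max(report, key=lambda name: report[name]["revenue_per_scan"])

for name, m in report.items():
    print(
        f"{name}: scan rate {m['scan_rate']:.1%}, "
        f"buyers/scan {m['buyers_per_scan']:.1%}, "
        f"revenue/scan ${m['revenue_per_scan']:.2f}"
    )
print("Best revenue per scan:", best)
```

Reading the output side by side makes the two failure modes easy to spot: a placement with a healthy scan rate but weak buyers per scan points at the offer, while the reverse points at visibility.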
What to Test Next
Once you know which placement wins, run your next test on the destination page.
- Option A: Send them to a discount code. Fast reward, low friction, good for repeat purchases.
- Option B: Send them to a review or loyalty signup. Takes more from the customer, but builds long-term value.
Use the same tracking setup. Same test window. Same measurement.
After that, test the text printed next to the QR code:
- "Scan for 10% off your next order"
- "Scan to register your product"
- "Scan for a free gift"
Same placement, same destination, different printed text, with a separate code for each version so you can tell the scans apart. Different results. You'll be surprised which line wins.
Each test answers one question. After three or four tests, you'll know your best placement, your best offer, and your best call to action. For a proven setup that uses QR codes to collect reviews specifically, getting Shopify reviews with QR codes walks through the full flow.
What If Your Volume Is Too Low to Test?
If your store ships fewer than 100 orders a month, getting 200 scans per code will take a while. That's fine.
In that case, test one variable at a time and run longer windows: 60 or 90 days. Or skip the split test and focus on one placement first (insert card is a safe starting point). Get a baseline, then try a different placement.
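To see what a longer window actually buys you, here is a back-of-the-envelope estimate. It assumes a steady order volume and borrows the roughly 10% packaging scan rate mentioned earlier; both are assumptions, and your real scan rate may be quite different. The function names are made up for illustration.

```python
import math


def expected_scans(orders_per_month: float, scan_rate: float, window_days: int) -> float:
    """Rough scan count for one code over a test window (30-day months)."""
    return orders_per_month * scan_rate * window_days / 30


def days_to_reach(target_scans: int, orders_per_month: float, scan_rate: float) -> int:
    """Rough number of days until one code collects target_scans."""
    return math.ceil(target_scans * 30 / (orders_per_month * scan_rate))


# Assumed inputs: 100 orders a month, about 10% of packages scanned.
for window in (30, 60, 90):
    scans = expected_scans(100, 0.10, window)
    print(f"{window} days: about {scans:.0f} scans per code")

print("Days to 200 scans:", days_to_reach(200, 100, 0.10))
```

At low volume the 200-scan mark can be a long way off, which is why a longer window gives you a baseline to work from, not a final answer.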
You don't need to run all three codes at once. Even a single dynamic code with proper tracking tells you something useful. Start there.
The stores that actually grow from offline marketing are the ones that measure it. Everything else is guesswork printed on cardstock.

