· Shah Zangeneh · Engineering · 6 min read
Building a Publix BOGO Alert Bot with Python and Telegram
How a "just for fun" web scraper turned into a Telegram bot that scrapes, filters, and delivers personalized alerts to multiple people automatically.

Publix runs BOGO (buy one, get one) deals every week. If you shop there regularly, the deals are genuinely worth checking — but the Publix website doesn’t have a filter for what you actually care about. You either scroll through everything or you miss something.
I started this as a data-mining practice project: scrape the site every week and save all BOGO deals to a .csv file, with the eventual goal of feeding that history to an LLM to predict future Publix BOGO deals. I then had the idea of using the scraper to text myself whenever something on the BOGO list matched items I always want or need.
I looked at the Publix website structure and found an accessibility-compliant version of the BOGO page that listed items in simple HTML. So I built a scraper with Python’s requests and BeautifulSoup libraries. The script scraped the page, filtered against a hardcoded list of keywords, and sent matches via SMS using Yahoo’s email-to-SMS gateway. It worked, mostly. But it had problems: the gateway was unreliable, the credentials were in the source, and if I wanted to share it with someone else I had to copy the script, swap the phone number, change the list of keywords, and maintain two versions. The last straw came a little less than a year ago, when Publix stopped publishing the BOGO deals on the accessibility page, which had been the easy one to scrape. I had to make a change.
This is the story of turning that script into something cleaner and more versatile.
Why the SMS approach had to go
The email-to-SMS gateway approach is one of those things that feels clever until it isn’t. You send an email to 5551234567@tmomail.net and T-Mobile delivers it as a text. Free, no API keys, no accounts. I used it with AT&T, Verizon, and T-Mobile, and delivery was inconsistent on all three.
The problems:
- Spam detection. Initially the messages were flagged as spam because the application fired many messages within milliseconds, so I had to add a random wait of 10 to 30 seconds between SMS messages.
- Carrier-dependent. T-Mobile’s gateway is tmomail.net. Verizon is vtext.com. AT&T is txt.att.net. You need to know which carrier each recipient uses, and carriers occasionally block or throttle these.
- Unreliable delivery. Even when not blocked, messages arrive late, out of order, or not at all.
- No good path to multiple recipients. Adding a second person meant another keywords list, a second hardcoded number, and a second email call. Basically another Python file to create and run.
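For reference, the gateway approach comes down to building a carrier-specific address and pausing between sends. A minimal sketch, assuming the three gateway domains listed above; the function names and carrier keys are hypothetical, and the email send itself (for example through smtplib) is left out:

```python
import random

# Gateway domains as listed above; the dictionary keys are illustrative labels.
SMS_GATEWAYS = {
    "tmobile": "tmomail.net",
    "verizon": "vtext.com",
    "att": "txt.att.net",
}


def gateway_address(phone_number: str, carrier: str) -> str:
    """Build the email address that a carrier forwards to a phone as SMS."""
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    if carrier not in SMS_GATEWAYS:
        raise ValueError(f"unknown carrier: {carrier}")
    return f"{digits}@{SMS_GATEWAYS[carrier]}"


def next_send_delay(low: float = 10.0, high: float = 30.0) -> float:
    """Pick a random pause, in seconds, to wait before the next message."""
    return random.uniform(low, high)


print(gateway_address("555-123-4567", "tmobile"))  # 5551234567@tmomail.net
```

The carrier lookup is exactly the part that makes this brittle: every new recipient adds another piece of information you have to get right.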
I looked at Twilio (reliable, but requires account verification and costs per message), WhatsApp (the unofficial libraries are against ToS), and Pushover (great for personal use, but $5/person).
Telegram won. Free, instant, official bot API, and recipients just need the app. The only friction is a one-time /start message to the bot — no opt-in codes, no sandbox numbers.
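To show how little is involved on the Telegram side, here is a rough sketch of a single alert sent through the Bot API's sendMessage method using only the standard library. The project's own sender is not shown in this post, so the function names, the placeholder token, and the chat ID below are assumptions for illustration:

```python
import json
import urllib.request

API_ROOT = "https://api.telegram.org"


def build_send_message_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) a Bot API sendMessage call."""
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(token: str, chat_id: str, text: str) -> dict:
    """Send the message and return Telegram's decoded JSON response."""
    request = build_send_message_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Placeholder token and chat ID; nothing is sent until send_message() is called.
req = build_send_message_request("123456:EXAMPLE", "555703745", "BOGO: Hummus")
print(req.full_url)
```

The chat ID only becomes available after the recipient has sent /start to the bot, which is the one-time friction mentioned above.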
The scraper
The Publix BOGO page is a JavaScript-rendered single-page app, so requests + BeautifulSoup won’t cut it. Selenium it is.
The page loads BOGO items as <li> elements with IDs starting with bogo-. Products are lazy-loaded as you scroll, so the scraper needs to handle infinite scroll:
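    import time

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # `driver` is an already-created Selenium WebDriver with the BOGO page loaded.
    # Wait up to 20 seconds for the first BOGO card to render.
    WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "li[id^='bogo-']"))
    )

    seen = set()   # product titles already collected
    stagnant = 0   # consecutive scrolls that produced no new titles

    while True:
        cards = driver.find_elements(By.CSS_SELECTOR, "li[id^='bogo-']")
        new_found = 0
        for card in cards:
            title = card.find_element(
                By.CSS_SELECTOR, "div[data-qa-automation='prod-title']"
            ).text.strip()
            if not title or title in seen:
                continue
            seen.add(title)
            new_found += 1
            # extract offer and validity...

        # Scroll one viewport down and give the lazy loader time to fetch more cards.
        driver.execute_script("window.scrollBy(0, window.innerHeight);")
        time.sleep(3)

        if new_found == 0:
            stagnant += 1
            if stagnant >= 3:
                break
        else:
            stagnant = 0

The stagnant counter is how we know we’ve hit the bottom: three consecutive scrolls with no new items, and we stop. A typical week pulls around 160–170 unique products.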
Filtering
Raw keyword matching against product names gets you most of the way there, but some keywords need guardrails. A few examples from real data:
- kellogg matches cereals, but also Eggo waffles and Pop-Tarts, which I don’t want
- popcorn matches shrimp (Gorton’s Popcorn Shrimp), chicken (Publix Popcorn Fried Chicken), and Popcorners chips, none of which are popcorn
- bertolli makes olive oil, vinegar, and pasta sauce; I only wanted the sauce
So there’s a small set of post-match filters:
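    def passes_special_filters(product):
        """Return False for products that match a keyword but are known false positives."""
        p = product.lower()
        # "kellogg" should only match cereal, not Eggo waffles or Pop-Tarts.
        if "kellogg" in p and "cereal" not in p:
            return False
        # "popcorn" should not match popcorn shrimp, popcorn chicken, or Popcorners.
        if "popcorn" in p and any(x in p for x in ["shrimp", "chicken", "popcorners"]):
            return False
        # "pasta" should not match products with "bowl" in the name.
        if "pasta" in p and "bowl" in p:
            return False
        # "bertolli" should only match the sauces.
        if "bertolli" in p and "sauce" not in p:
            return False
        return True

Not elegant, but it reflects real knowledge about how these products are named on the site. Years of weekly scrape data helped identify these edge cases.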
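The keyword match that runs before these guardrails is not shown, so here is one way the two steps might fit together: a case-insensitive substring match followed by the filter. The match_deals function and the sample product names are made up for illustration, and only two of the guardrails are repeated so that the sketch runs on its own:

```python
def passes_special_filters(product: str) -> bool:
    """Two of the post-match guardrails from above, repeated so this runs alone."""
    p = product.lower()
    if "kellogg" in p and "cereal" not in p:
        return False
    if "popcorn" in p and any(x in p for x in ["shrimp", "chicken", "popcorners"]):
        return False
    return True


def match_deals(products: list[str], keywords: list[str]) -> list[str]:
    """Keep products that contain any keyword (case-insensitive) and pass the guardrails."""
    wanted = [k.lower() for k in keywords]
    return [
        name
        for name in products
        if any(k in name.lower() for k in wanted) and passes_special_filters(name)
    ]


# Made-up product names for illustration only.
sample = [
    "Gorton's Popcorn Shrimp",
    "Orville Redenbacher's Popcorn",
    "Kellogg's Frosted Flakes Cereal",
    "Kellogg's Eggo Waffles",
    "Sabra Hummus",
]
print(match_deals(sample, ["popcorn", "kellogg"]))
# ["Orville Redenbacher's Popcorn", "Kellogg's Frosted Flakes Cereal"]
```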
The architecture
The project ended up as three files:
- scraper.py: shared utilities (Selenium scraper, keyword filtering, Telegram sender, user management)
- bot.py: the Telegram bot, which handles commands and runs the weekly scheduled scan
- send_alerts.py: a simple manual runner for triggering a scan outside the schedule
User data lives in users.json:
    {
      "555703745": {
        "name": "Bob",
        "store_id": "2500819",
        "keywords": ["beer", "hummus", "pasta", "kefir"]
      },
      "55507917": {
        "name": "Dave",
        "store_id": "2705086",
        "keywords": ["beer", "feta", "Greek yogurt", "cream cheese"]
      }
    }

Each user has their own keyword list and store ID. When the weekly scan runs, users at the same store share a single scrape; there is no point hitting the Publix server twice for the same data.
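The shared scrape can be pictured as a grouping step that runs before any scraping happens. This is a sketch of the idea, not the project's code: group_users_by_store is a hypothetical helper, the third user is invented to show two people at one store, and the top-level keys are assumed to be Telegram chat IDs:

```python
from collections import defaultdict

# Same shape as the users.json example above.
users = {
    "555703745": {"name": "Bob", "store_id": "2500819", "keywords": ["beer", "hummus"]},
    "55507917": {"name": "Dave", "store_id": "2705086", "keywords": ["beer", "feta"]},
    # Hypothetical third user who shares Bob's store.
    "55500001": {"name": "Eve", "store_id": "2500819", "keywords": ["kefir"]},
}


def group_users_by_store(users: dict) -> dict[str, list[str]]:
    """Map each store ID to the keys of the users who shop there."""
    by_store = defaultdict(list)
    for chat_id, profile in users.items():
        by_store[profile["store_id"]].append(chat_id)
    return dict(by_store)


for store_id, chat_ids in group_users_by_store(users).items():
    # One scrape per store would go here, followed by per-user filtering.
    print(store_id, chat_ids)
```

With this grouping, the number of scrapes scales with the number of distinct stores rather than the number of users.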
The bot
python-telegram-bot handles the command routing. The commands users interact with:
| Command | What it does |
|---|---|
| /start | Register and begin setup |
| /findstore <zip> | Find nearby stores and their IDs |
| /store <id> | Set your Publix store |
| /add <item> | Add to your watch list (supports comma-separated) |
| /remove <item> | Remove an item |
| /list | Show your watch list and store |
| /scan | Trigger a scrape right now |
| /stop | Opt out and delete all your data |
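The comma-separated form of /add implies a small parsing step. The bot's actual handler is not shown, so the following is only a guess at the shape of it; parse_items is a hypothetical name, and dropping case-insensitive duplicates is an assumption rather than documented behavior:

```python
def parse_items(argument_text: str) -> list[str]:
    """Split the text after '/add' on commas, trimming blanks and skipping repeats."""
    items: list[str] = []
    seen: set[str] = set()
    for raw in argument_text.split(","):
        item = raw.strip()
        key = item.lower()
        if item and key not in seen:
            seen.add(key)
            items.append(item)
    return items


print(parse_items("beer, Greek yogurt, , Beer"))  # ['beer', 'Greek yogurt']
```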
Finding the store ID was a pain point worth solving. Publix doesn’t surface it in the URL — you have to look at network requests in DevTools. The /findstore command queries the Publix store locator API directly:
    https://services.publix.com/storelocator/api/v1/stores/?types=R,G,H,N,S&count=10&distance=20&zip=33000&isWebsite=true

The response includes a weeklyAd.storeId field, distinct from the store’s display number, which is what the BOGO URL actually uses. The bot presents nearby stores with a one-tap /store <id> command to set it.
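As a small illustration, the query string for a given ZIP code can be assembled with the standard library. The parameter values mirror the URL above; store_locator_url is a hypothetical helper name, and since the response layout beyond weeklyAd.storeId is not shown here, the sketch stops at building the URL:

```python
from urllib.parse import urlencode

STORE_LOCATOR_URL = "https://services.publix.com/storelocator/api/v1/stores/"


def store_locator_url(zip_code: str, count: int = 10, distance: int = 20) -> str:
    """Build the store locator query for a ZIP code, mirroring the URL above."""
    params = {
        "types": "R,G,H,N,S",
        "count": count,
        "distance": distance,
        "zip": zip_code,
        "isWebsite": "true",
    }
    # safe="," keeps the commas in `types` unescaped, matching the URL above.
    return f"{STORE_LOCATOR_URL}?{urlencode(params, safe=',')}"


print(store_locator_url("33000"))
```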
Weekly schedule
The bot uses python-telegram-bot’s built-in job queue (backed by APScheduler) to run the weekly scan automatically:
    from datetime import time
    from zoneinfo import ZoneInfo

    eastern = ZoneInfo("America/New_York")

    # Run weekly_scan at 2:00 PM Eastern on one day of the week.
    app.job_queue.run_daily(
        weekly_scan,
        time=time(14, 0, 0, tzinfo=eastern),
        # Thursday when 0 = Monday. python-telegram-bot 20+ numbers days from
        # Sunday (0), so check the installed version: there, Thursday is 4.
        days=(3,),
        name="weekly_bogo_scan",
    )

The scraper runs in a thread so it doesn’t block the bot’s async event loop during the 2–3 minute scrape:

    df = await asyncio.to_thread(get_bogo_deals, store_id)

What I’d do differently
A few things I’d change if starting from scratch:
- Store the keyword lists differently. Right now special filter rules (the kellogg/popcorn carve-outs) are hardcoded. They should be per-user config in users.json.
- Persist the scrape results. Right now each scan is ephemeral. Storing results would let you diff week-over-week and only notify users about new deals.
- Add error reporting. If Selenium fails mid-scrape or Telegram rate-limits the bot, it fails silently. A simple admin notification would help.
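The week-over-week diff from the second item could be as small as the sketch below. None of this exists in the project yet; the function names and the choice of a JSON file holding a list of product names are assumptions:

```python
import json
from pathlib import Path


def new_deals(this_week: list[str], last_week: list[str]) -> list[str]:
    """Return this week's products that were not in last week's scrape, in order."""
    previous = set(last_week)
    return [name for name in this_week if name not in previous]


def load_last_scrape(path: Path) -> list[str]:
    """Read the previously saved product list, or an empty list on the first run."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_scrape(path: Path, products: list[str]) -> None:
    """Overwrite the saved product list with the latest scrape."""
    path.write_text(json.dumps(products, indent=2), encoding="utf-8")


# Made-up product names for illustration only.
print(new_deals(["Hummus", "Kefir", "Feta"], ["Kefir"]))  # ['Hummus', 'Feta']
```

A scan would then call load_last_scrape, notify only on new_deals, and finish with save_scrape.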
The full project is on GitHub. It’s a practical script for a practical problem — nothing groundbreaking, but a good example of how a one-off scraper can grow into something you’d actually trust to run unattended. The latest iteration of the design and code was built with the help of Claude.