
Reddit Search API Tutorial: Query Subreddits by Keyword in Python (2026)

Search Reddit posts by keyword in Python in 2026. Native /search.json, PRAW subreddit.search(), and a managed REST endpoint compared, with copy-paste code, parameters, and the 1,000-result cap explained.


Not affiliated with Reddit Inc. redditapis.com is an independent third-party REST proxy for Reddit's API. This tutorial is vendor-neutral: it shows the native /search.json endpoint, the PRAW library, and a managed REST option side by side so you can pick what fits.


TL;DR: The Reddit search API is the public https://www.reddit.com/search.json endpoint. Pass q for your keyword, sort and t for ordering, and restrict_sr=1 to scope to one subreddit. It is free and needs no OAuth for public subreddits at 60 requests per minute, but it caps at 1,000 results per query and gets 403-blocked from many cloud IPs. PRAW wraps the same endpoint with OAuth at 100 req/min. For production search at scale, a managed REST endpoint like https://api.redditapis.com/api/reddit/search returns the same data through a clean IP pool for $0.002 per call.


What you will build:

  • A working keyword search against /search.json and /api/reddit/search in Python
  • Subreddit-restricted search with restrict_sr and the managed subreddit parameter
  • PRAW .search() with every valid sort and time_filter value
  • A pagination loop that walks past the first page using the after cursor

What Is the Reddit Search API

The Reddit search API is the public endpoint at https://www.reddit.com/search.json. You send a GET request with a q query parameter and Reddit returns a JSON listing of matching posts. Call /r/<subreddit>/search.json with restrict_sr=1 to scope the search to a single community, or hit the bare /search.json to search across all of Reddit. The response is a standard Reddit listing object: a kind field plus a data.children array, each child wrapping a post with its title, subreddit, score, and permalink. For public subreddits the endpoint answers without OAuth at up to 60 requests per minute.

That is the whole contract. Everything else in this tutorial, including PRAW and the managed REST option, is a different wrapper around that same search behavior. Developers in r/redditdev ask the same question constantly: how do I search all of Reddit by keyword, not just one subreddit. The thread below is a clean example of the demand.

r/redditdev·u/[deleted]

Searching all Reddit posts with API

Hey guys! So I'm trying to do a normal Reddit search with API. There's a hiccup though: I can't find such an endpoint in Reddits API documentation. I did find this post:…


This guide does what most ranking pages skip. It hands you copy-paste Python for every path and explains where each one breaks, instead of pitching a product over a search box.


Search Endpoint Parameters: The Reference

Before the code, here is what every parameter does. These are the same query parameters whether you call the native endpoint directly or through a managed REST proxy.

Reddit search API parameter reference table: q, sort, t, restrict_sr, subreddit, limit, after with examples

The two parameters that trip people up are t and restrict_sr. The t time window matters most with sort=top and sort=comments, where it bounds the ranking period; check its effect on other sorts against live results before relying on it. And restrict_sr only works when you call a subreddit-scoped URL (/r/python/search.json), not the global /search.json. On the managed path you pass a subreddit parameter instead, which is less error-prone. The full parameter list lives in the Reddit Data API wiki and the managed API documentation.
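One way to keep these parameters straight is to build the URL and the params dict in a single place. The sketch below is illustrative only: build_search_request is a hypothetical helper name, not part of any library, and the accepted sort and t values are the ones this tutorial lists. It makes no network call.

```python
VALID_SORTS = {"relevance", "hot", "top", "new", "comments"}
VALID_WINDOWS = {"hour", "day", "week", "month", "year", "all"}


def build_search_request(q, subreddit=None, sort="relevance", t="all",
                         limit=25, after=None):
    """Return (url, params) for a native search call (hypothetical helper)."""
    if sort not in VALID_SORTS:
        raise ValueError(f"unknown sort: {sort!r}")
    if t not in VALID_WINDOWS:
        raise ValueError(f"unknown time window: {t!r}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")

    params = {"q": q, "sort": sort, "t": t, "limit": limit}
    if subreddit:
        # restrict_sr only has an effect on a subreddit-scoped URL,
        # so the two are always set together here.
        url = f"https://www.reddit.com/r/{subreddit}/search.json"
        params["restrict_sr"] = 1
    else:
        url = "https://www.reddit.com/search.json"
    if after:
        params["after"] = after
    return url, params
```

The returned pair can be passed straight to requests.get(url, params=params, headers=...), so a typo in sort or t fails locally instead of producing a silently different result set.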


Native /search.json in Python

The native path uses nothing but the requests library and a descriptive User-Agent. Here is a global keyword search across all of Reddit:

import requests

URL = "https://www.reddit.com/search.json"
HEADERS = {"User-Agent": "reddit-search-tutorial/1.0 (by u/your_username)"}

params = {
    "q": "python web scraping",
    "sort": "new",
    "limit": 25,
}

resp = requests.get(URL, params=params, headers=HEADERS, timeout=20)
resp.raise_for_status()
data = resp.json()

for child in data["data"]["children"]:
    post = child["data"]
    print(f"r/{post['subreddit']} | {post['score']:>5} | {post['title']}")

print("next page cursor:", data["data"].get("after"))

The response is a listing object. data.children is the array of posts, each wrapped in a {"kind": "t3", "data": {...}} envelope. The data.after field is the pagination cursor; pass it back as an after parameter to fetch the next page.
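To make the envelope concrete, here is a minimal sketch that unwraps a listing into a plain list of posts plus the cursor. flatten_listing is a hypothetical helper, and the sample payload is made up for illustration; only the key names (kind, data, children, after) come from the description above.

```python
def flatten_listing(listing):
    """Return (posts, after) from a Reddit listing dict (hypothetical helper)."""
    data = listing.get("data", {})
    posts = [
        child["data"]
        for child in data.get("children", [])
        if child.get("kind") == "t3"  # t3 is the envelope kind for posts
    ]
    return posts, data.get("after")


# Made-up sample in the shape described above, not a real API response.
sample = {
    "kind": "Listing",
    "data": {
        "after": "t3_abc123",
        "children": [
            {"kind": "t3", "data": {"title": "First", "subreddit": "Python", "score": 10}},
            {"kind": "t3", "data": {"title": "Second", "subreddit": "learnpython", "score": 3}},
        ],
    },
}

posts, after = flatten_listing(sample)
print([p["title"] for p in posts], after)
```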

To scope to one subreddit, change the URL and add restrict_sr:

import requests

URL = "https://www.reddit.com/r/Python/search.json"
HEADERS = {"User-Agent": "reddit-search-tutorial/1.0 (by u/your_username)"}

params = {
    "q": "asyncio",
    "restrict_sr": 1,
    "sort": "top",
    "t": "year",
    "limit": 25,
}

resp = requests.get(URL, params=params, headers=HEADERS, timeout=20)
resp.raise_for_status()  # surface a 403 or 429 instead of failing on .json()
posts = resp.json()["data"]["children"]
print(f"Found {len(posts)} matches in r/Python")

A note on query syntax, because it changes your results more than people expect. Reddit's search supports field operators in the q string: title:keyword matches only the title, selftext:keyword matches the body, subreddit:python scopes without a separate URL, and author:username filters by poster. Combine them with boolean operators: q=title:fastapi AND selftext:async is a valid query. Quote multi-word phrases, q="background tasks", to match the phrase rather than the two words independently. URL-encoding is handled for you when you pass params as a dict to requests, so write the raw string and let the library encode it. These operators work identically on the native endpoint and through a managed proxy, since both forward the same q to Reddit's search index.
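The operators above can be composed programmatically. This is a small sketch under stated assumptions: build_query is a hypothetical helper, it joins every clause with AND, and it uses the standard library's urlencode only to show what the encoded string looks like; it does not contact Reddit.

```python
from urllib.parse import urlencode


def build_query(*, title=None, selftext=None, author=None, phrase=None):
    """Compose a q string from field operators (hypothetical helper)."""
    parts = []
    if title:
        parts.append(f"title:{title}")
    if selftext:
        parts.append(f"selftext:{selftext}")
    if author:
        parts.append(f"author:{author}")
    if phrase:
        parts.append(f'"{phrase}"')  # quotes match the phrase as a unit
    return " AND ".join(parts)


q = build_query(title="fastapi", selftext="async")
print(q)                    # title:fastapi AND selftext:async
print(urlencode({"q": q}))  # q=title%3Afastapi+AND+selftext%3Aasync
```

Pass the raw q string in the params dict and leave the encoding to the HTTP library; encoding it yourself first would encode it twice.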

One honest warning, because it is the single most common failure in 2026. Reddit filters a large share of cloud and datacenter IP ranges. The code above runs perfectly from a laptop or a residential connection, but the same script on an AWS, GCP, or DigitalOcean box frequently returns 403 Forbidden no matter how correct your headers are. That is not a bug in your code. It is Reddit refusing the source IP. The next section shows what developers actually hit, and the 403 section covers the fix.


The Developer Reality: 403s and Result Caps

Before the next code path, it helps to name the two walls developers hit most, because they shape which option you should pick:

  • The 403 wall. A search call that reads correctly still returns 403 Forbidden. Usually a host, header, or IP problem, not a logic bug.
  • The result-cap wall. A query that should return thousands of posts quietly stops at a few hundred or at exactly 1,000. That is Reddit's pagination ceiling, not your loop.

Two real threads capture both. First, the 403 on a search call that looks correct:

r/redditdev·u/[deleted]

Forbidden API call - Search within subreddit

I'm making an API call in python to the Reddit API and receiving a response of: {'message': 'Forbidden', 'error': 403} I get the JSON I am looking for as expected when I visit this URL in the…


Second, the discovery that PRAW search quietly stops returning results well before you expect it to. One developer reported getting "around 241 results which is fine," then realizing the query was not exhaustive at all:

r/redditdev·u/Informal_Flatworm_78

PRAW is not fetching all the submissions when search for multiple keywords separated by OR

when I am using below code to fetch posts/submissions containing "420 community" in tittle, I am getting somewhat around 241 results which is fine. I am again searching for posts containing "cannabinoid medicine" in…


Both are real limits, not user error. The 403 is usually an IP or User-Agent problem. The truncated result set is Reddit's hard pagination ceiling. Understanding both is what separates a search script that works in a demo from one that holds up in production.


Searching Reddit with PRAW

If you already use PRAW, the search method is subreddit.search(). Search all of Reddit by targeting the synthetic all subreddit:

import praw

reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="reddit-search-tutorial/1.0 (by u/your_username)",
)

results = reddit.subreddit("all").search(
    "machine learning",
    sort="relevance",
    time_filter="week",
    limit=25,
)

for submission in results:
    print(f"r/{submission.subreddit} | {submission.score} | {submission.title}")

The exact method signature, straight from the PRAW documentation, is:

search(query, *, sort="relevance", syntax="lucene", time_filter="all", **generator_kwargs)

Valid values, per praw.readthedocs.io: sort accepts relevance, hot, top, new, comments. time_filter accepts all, day, hour, month, week, year. syntax accepts cloudsearch, lucene, plain. PRAW handles the OAuth flow and rate-limit backoff for you, which is its main advantage over raw requests. Its main constraint is the same 1,000-result ceiling the native endpoint carries, because PRAW is calling that same endpoint underneath. If you are weighing PRAW against a plain REST approach, the REST vs PRAW comparison and the PRAW vs managed REST breakdown go deeper.
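Because a mistyped sort or time_filter is easy to miss, a thin guard in front of subreddit.search() can help. This is a sketch, not PRAW's own behavior: checked_search is a hypothetical wrapper, the allowed values are the ones listed above, and the subreddit argument is whatever reddit.subreddit(...) returns.

```python
PRAW_SORTS = {"relevance", "hot", "top", "new", "comments"}
PRAW_TIME_FILTERS = {"all", "day", "hour", "month", "week", "year"}


def checked_search(subreddit, query, *, sort="relevance",
                   time_filter="all", limit=25):
    """Validate arguments, then delegate to subreddit.search() (hypothetical wrapper)."""
    if sort not in PRAW_SORTS:
        raise ValueError(f"unknown sort: {sort!r}")
    if time_filter not in PRAW_TIME_FILTERS:
        raise ValueError(f"unknown time_filter: {time_filter!r}")
    # limit travels through PRAW's **generator_kwargs, as in the example above.
    return subreddit.search(query, sort=sort, time_filter=time_filter, limit=limit)
```

Usage would look like checked_search(reddit.subreddit("all"), "machine learning", sort="top", time_filter="week"), with reddit being the praw.Reddit instance created earlier.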



The Managed REST Path: One Authenticated GET

The managed option trades the IP and OAuth machinery for a bearer token and a flat per-call price. It is the path to reach for when search has to run from a server, run concurrently, or never get 403-blocked. Here is the exact call, live-tested against the production endpoint:

import os
import requests

API_KEY = os.environ["REDDITAPI_KEY"]
BASE = "https://api.redditapis.com"

resp = requests.get(
    f"{BASE}/api/reddit/search",
    params={"q": "machine learning", "sort": "top", "t": "week", "limit": 25},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
data = resp.json()

for post in data["posts"]:
    print(f"r/{post['subreddit']} | {post['upvotes']:>5} | {post['title']}")

print("next page cursor:", data.get("after"))

The response shape is a top-level posts array plus an after cursor for pagination. Each post object carries id, name, title, author, permalink, url, text, subreddit, upvotes, comments, upvote_ratio, over_18, stickied, locked, is_self, created_utc, and more. Note the field names: the managed path uses upvotes and comments rather than the native score and num_comments, so map them once in your code.

Side by side JSON response shape: native search.json nested listing with score and num_comments versus managed posts array with upvotes and comments

The flatter shape is the practical difference at integration time. The native endpoint nests every post under data.children[i].data, which means a lot of bracket-walking before you reach a title. The managed path hands you a plain posts list where each element is the post object directly, so iterating is for post in data["posts"] with no envelope to peel. If you are normalizing both into one internal model, the only field rename you need is score to upvotes and num_comments to comments. Everything else (title, subreddit, permalink, created_utc) lines up.
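The rename can live in two small adapters. The sketch below assumes only the field names mentioned in this section; from_native and from_managed are hypothetical names, and the shared shape keeps the managed spelling (upvotes, comments).

```python
SHARED_FIELDS = ("title", "subreddit", "permalink", "created_utc")


def from_native(child):
    """Map one native listing child ({"kind": "t3", "data": {...}}) to the shared shape."""
    post = child["data"]
    row = {name: post[name] for name in SHARED_FIELDS}
    row["upvotes"] = post["score"]          # native score -> upvotes
    row["comments"] = post["num_comments"]  # native num_comments -> comments
    return row


def from_managed(post):
    """Map one managed post object to the shared shape (no renames needed)."""
    row = {name: post[name] for name in SHARED_FIELDS}
    row["upvotes"] = post["upvotes"]
    row["comments"] = post["comments"]
    return row
```

Downstream code then reads row["upvotes"] and row["comments"] regardless of which path produced the row.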

To scope to a single community, pass subreddit instead of fiddling with restrict_sr:

import os
import requests

API_KEY = os.environ["REDDITAPI_KEY"]
BASE = "https://api.redditapis.com"

resp = requests.get(
    f"{BASE}/api/reddit/search",
    params={"q": "fastapi", "subreddit": "Python", "limit": 25},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()  # fail loudly on a bad key or quota error
posts = resp.json()["posts"]
for p in posts:
    print(f"r/{p['subreddit']} | {p['title']}")

Reddit search request flow: keyword to one GET call to a scored posts array with an after cursor

Both calls above were run against https://api.redditapis.com/api/reddit/search with a live bearer token while writing this tutorial; both returned HTTP 200 with a populated posts array. The first $0.50 of credit is free at signup, so you can paste this and watch it return real posts without a card.


Native vs PRAW vs Managed vs PullPush

There is no single best answer; there is a best answer for your constraints. This grid lays out the four real options developers reach for.

Comparison grid: native search.json, PRAW search, managed REST API, and PullPush across auth, setup, rate limit, result cap, historical data, and cost

A quick read on each. Native /search.json is free and zero-setup, ideal for a notebook or a script on your own machine, but it caps at 60 req/min and gets IP-filtered from cloud hosts. PRAW (praw.readthedocs.io) is the right call if you are already in the PRAW ecosystem and want OAuth and backoff handled, accepting the 100 req/min and 1,000-result ceilings; it is published on PyPI and maintained on GitHub. PullPush (pullpush.io) is the free successor to the shut-down Pushshift project, useful for historical archive search but with no uptime guarantee. A managed REST API abstracts the IP pool, auth, and rate limiting for a per-call fee, which is the trade that makes sense once search is load-bearing in production. Reddit's own access terms are described in the official developer API docs and the Reddit Data API wiki.

For a sense of how teams are now wiring multi-source search into AI agents, this thread on an open-source skill that fans out across Reddit, X, YouTube, and Hacker News in parallel is a good snapshot of where the demand is heading:

Sharbel (@sharbel)

Someone built an AI agent that searches Reddit, X, YouTube, HN, TikTok, Polymarket, and the web in parallel. Scores everything by real upvotes, real likes, and real money. Synthesizes it into one brief. In seconds. It's called /last30days. 28,700+ stars on GitHub. You type ht…


The 1,000-Result Cap and What to Do About It

Here is the limit that surprises people most. Both the native endpoint and PRAW are bounded by Reddit's pagination: you can page through at most 1,000 posts per search query, 10 pages of 100 results each. Reddit does not expose an exhaustive full-text index, so "give me every post ever made about X" is not a request the search API can satisfy.

Reddit search result-cap stat: 1,000 max posts per query and 60 requests per minute, recent content covered but full history capped

For most use cases this is fine. Brand monitoring, trend tracking, and RAG pipelines care about recent posts, and the last 7 to 30 days fit comfortably inside 1,000 results for all but the highest-volume keywords. When you genuinely need the full archive, you have two routes. PullPush exposes /reddit/search/submission/ and /reddit/search/comment/ over community-archived data, free but unreliable. A managed search API maintains its own index and paginates past the native ceiling with a cursor. Pick based on whether reliability or zero cost matters more for your job. The historical pattern is documented across r/redditdev, where the old advice to "use Pushshift" still circulates even though Pushshift shut down in May 2023.

Timeline of Reddit historical search: Pushshift before 2023, Pushshift shutdown in May 2023, PullPush as successor, and indexed options in 2026

That timeline matters because so much of the search advice online predates it. A 2022 thread will confidently tell you to "just use Pushshift," and the answer was correct then and is dead now. When you read an older tutorial or StackOverflow answer about exhaustive Reddit search, check the date. Anything before mid-2023 assumes a free, complete index that no longer exists. The 2026 reality is narrower: native search for the last 1,000 results, PullPush for best-effort history, and a maintained index when you need history with a reliability guarantee. The Reddit data licensing context explains why the free-archive era ended and what replaced it.

Capturing comments alongside posts

When your workload needs the discussion under a matched post, not just the post itself, you can fetch comments by permalink rather than running a second search. The managed path exposes a comments endpoint that takes a permalink and returns the post plus its comment tree, which is cheaper and more complete than a comment search for a known thread:

import os
import requests

API_KEY = os.environ["REDDITAPI_KEY"]
BASE = "https://api.redditapis.com"

# first find posts by keyword
search_resp = requests.get(
    f"{BASE}/api/reddit/search",
    params={"q": "fastapi background tasks", "subreddit": "Python", "limit": 5},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
search_resp.raise_for_status()
posts = search_resp.json()["posts"]
if not posts:
    raise SystemExit("no matches for this query")

# then pull the comment tree for the top match
top = posts[0]
comments_resp = requests.get(
    f"{BASE}/api/reddit/comments",
    params={"permalink": top["permalink"]},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
comments_resp.raise_for_status()
comments = comments_resp.json()

print(f"{top['title']} has a tree with keys: {list(comments.keys())}")

This two-step pattern (search for matches, then fetch the tree for the ones you care about) is the standard shape for research and monitoring jobs. It keeps your call volume proportional to the posts you actually act on instead of paging through comments you will discard.
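The "only the ones you care about" step is just a filter between the two calls. Here is one possible sketch: pick_threads is a hypothetical helper, the thresholds are arbitrary, and it relies on the comments and permalink fields of the managed post object described earlier.

```python
def pick_threads(posts, *, min_comments=5, max_threads=3):
    """Return permalinks of the most-discussed posts worth a comments call."""
    candidates = [p for p in posts if p.get("comments", 0) >= min_comments]
    candidates.sort(key=lambda p: p["comments"], reverse=True)
    return [p["permalink"] for p in candidates[:max_threads]]
```

Each returned permalink then costs exactly one comments call, so the second step scales with max_threads rather than with the size of the search page.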


Pagination: Walking Past the First Page

A single call returns one page. To collect more, loop on the after cursor. Here is the pattern on the managed path; the native path works the same way with data.after:

import os
import requests

API_KEY = os.environ["REDDITAPI_KEY"]
BASE = "https://api.redditapis.com"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def search_all(query, max_pages=5):
    collected = []
    after = None
    for _ in range(max_pages):
        params = {"q": query, "sort": "new", "limit": 50}
        if after:
            params["after"] = after
        resp = requests.get(f"{BASE}/api/reddit/search", params=params,
                             headers=HEADERS, timeout=60)
        resp.raise_for_status()
        data = resp.json()
        collected.extend(data["posts"])
        after = data.get("after")
        if not after:
            break
    return collected

posts = search_all("local llm", max_pages=2)
print(f"Collected {len(posts)} posts across pages")

Set max_pages deliberately. Remember the native ceiling: Reddit returns at most 1,000 results for any single query, which is 10 pages at limit=100 or 20 pages at the limit=50 used above, so paging deeper than that on one keyword wastes calls. If you need more coverage, vary the query (add a subreddit scope, narrow the t window, or split the keyword into variants) rather than paging deeper on the same one.
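The page budget follows from the result ceiling and the page size, so it can be computed rather than guessed. The sketch below is transport-agnostic: max_useful_pages and paginate are hypothetical helpers, and fetch_page is any callable you supply that takes a cursor and returns (posts, next_cursor), whether it wraps the native or the managed call.

```python
import math


def max_useful_pages(limit, cap=1000):
    """Pages needed to reach the cap at a given page size."""
    return math.ceil(cap / limit)


def paginate(fetch_page, *, limit=100, cap=1000):
    """Collect posts by following the cursor until it runs out or the cap is hit.

    fetch_page(after) must return (posts, next_after); it is supplied by the caller.
    """
    collected = []
    after = None
    for _ in range(max_useful_pages(limit, cap)):
        posts, after = fetch_page(after)
        collected.extend(posts)
        if not after or not posts:
            break  # no cursor or an empty page means there is nothing more
    return collected[:cap]


print(max_useful_pages(100), max_useful_pages(50))  # 10 20
```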


Why Search Returns 403, and How to Fix It

The 403 is the most-reported Reddit search problem, and it has three usual causes:

  1. Wrong host. Calling https://oauth.reddit.com/search without a valid bearer token returns 403. For public search use https://www.reddit.com/search.json, which does not require OAuth.
  2. Missing or banned User-Agent. Reddit rejects requests with no User-Agent or with a generic library default. Send a descriptive one: myapp/1.0 (by u/yourusername).
  3. Filtered IP. This is the 2026 reality. Reddit blocks many cloud and datacenter IP ranges, so a correct script on a server can still 403 while the identical script on your laptop works. There is no header that fixes a filtered IP.

Three causes of a Reddit search 403 with fixes: wrong host, missing User-Agent, and filtered datacenter IP

Causes 1 and 2 are easy. Cause 3 is the one that pushes teams off the native endpoint in production: you either run from residential IPs you maintain yourself, or you route through a managed API whose IP pool Reddit accepts. A quick way to diagnose which cause you have hit: run the exact same script from your laptop. If it works locally but 403s on the server, you have an IP filter (cause 3), not a code bug. If it 403s everywhere, check the host and the User-Agent first. For the broader 2026 access picture, see the Reddit API authentication and OAuth guide and the no-PRAW Python tutorial for the full read and write endpoint set. The competitor benchmark on scraping throughput and error rates shows how often that IP filter actually bites at volume.
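The diagnosis order described above can be written down as a short checklist function. This is only a sketch of that reasoning: classify_403 is a hypothetical helper, the returned labels are invented for illustration, and treating a User-Agent that starts with python-requests or python-urllib as "generic" is an assumption about what counts as a library default.

```python
GENERIC_AGENT_PREFIXES = ("python-requests", "python-urllib")


def classify_403(*, host, user_agent, works_locally):
    """Return the most likely cause of a search 403, checked in order."""
    if host == "oauth.reddit.com":
        # The OAuth host rejects calls that lack a valid bearer token.
        return "wrong_host"
    if not user_agent or user_agent.lower().startswith(GENERIC_AGENT_PREFIXES):
        return "user_agent"
    if works_locally:
        # Same script, same headers, only the source IP differs.
        return "filtered_ip"
    return "unknown"
```

For example, a server-side 403 with a descriptive User-Agent against www.reddit.com that succeeds from a laptop would come back as "filtered_ip".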




Searching Comments, Not Just Posts

Post search covers most needs, but comment search comes up constantly for research and monitoring. This thread is a typical ask: find comments containing a keyword and trace back to their submission.

r/redditdev·u/yumere7833

Can I search for comments including specific keywords or string by using praw library?

I'm collecting some submissions for research. I want to search for comments that include some url or keywords, and obtain submission of that comment. I'm newbie for praw and reddit. Please let me know how should I do.…


On the native path, add type=comment to a subreddit search URL. Reddit's own comment search is shallower than post search and is mostly within-subreddit. For cross-Reddit historical comment search, PullPush's /reddit/search/comment/ endpoint or a managed index is the realistic route. If you only need the comments under a known post, fetch them by permalink rather than searching at all, which is cheaper and complete.


Keyword search is rarely the goal on its own; it is the first step in a larger job. Four patterns cover most of what gets built on top of it, and each one nudges your parameter choices in a specific direction.

Four use cases built on Reddit search: brand monitoring, RAG ingestion, trend tracking, and lead discovery with the parameters each one favors

Brand monitoring wants the freshest mentions, so it leans on sort=new and runs on a short schedule, alerting when a new match appears. RAG ingestion wants the highest-signal posts to ground an LLM answer, so it favors sort=top with a t=year window and pulls the post text for embedding. Trend tracking repeats the same query across rolling t=week windows and charts upvote volume over time. Lead discovery scopes a phrase to relevant subreddits with the subreddit parameter and watches for people asking for a solution. The same endpoint serves all four; the parameters are what specialize it. If you are wiring search into an agent or a pipeline, the no-PRAW Python tutorial covers the surrounding read and write calls you will likely pair with it.
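Since the parameters are what specialize the endpoint, the four patterns can be captured as presets. The mapping below is a sketch drawn from the paragraph above; SEARCH_PRESETS and params_for are hypothetical names, and anything the paragraph does not specify (such as a sort for trend tracking) is simply left unset.

```python
SEARCH_PRESETS = {
    "brand_monitoring": {"sort": "new"},           # freshest mentions first
    "rag_ingestion": {"sort": "top", "t": "year"}, # highest-signal posts
    "trend_tracking": {"t": "week"},               # rolling weekly windows
    "lead_discovery": {},                          # scoped via subreddit override
}


def params_for(use_case, q, **overrides):
    """Merge a preset with the query and any per-call overrides."""
    if use_case not in SEARCH_PRESETS:
        raise KeyError(f"unknown use case: {use_case!r}")
    return {"q": q, **SEARCH_PRESETS[use_case], **overrides}


print(params_for("lead_discovery", "looking for a tool", subreddit="Python"))
```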

Reddit Search in the Age of AI Overviews, 2026

The reason Reddit search demand keeps climbing in 2026 is that AI systems now treat Reddit as a primary source. Two shifts drive it:

  • AI Overviews cite Reddit. Google's AI Overviews surface Reddit threads, and several assistants cite Reddit directly when answering "what do people think about X."
  • Agents need a bridge. Each platform is walled off behind its own API and auth, so an agent that wants to read what people actually said has to query Reddit search itself rather than relying on the model's training memory.

This widely-shared thread frames the gap plainly:

Awais (@drawais_ai)

Look at how broken modern search is. Google indexes editors. ChatGPT has a Reddit deal but no X or TikTok. Gemini has YouTube but no Reddit. Claude has none of them natively. Each platform is a walled garden with its own auth, its own tokens, its own API. So the AI agent reading…

The practical takeaway for a developer is that Reddit search is no longer just a way to find threads for a human to read. It is increasingly a data feed into an AI system: a retrieval step that grounds a model in real community discussion rather than a model's training memory. That use case rewards two things, reliability and freshness, which is exactly where the native endpoint's IP filtering and 1,000-result cap start to chafe. If your search is feeding a model that users see, an intermittent 403 is not a logging nuisance; it is a hole in the answer. This is the structural reason production search workloads drift toward a managed index even when the hobby path technically works. For how this connects to Reddit's broader data strategy, see the usage-based AI data licensing breakdown.

How Much Reddit Search Costs at Volume

Cost is the last input to the decision, and it is simpler than it looks because the three paths price on different axes:

  • Native /search.json: free in dollars, but you pay in an IP pool to dodge filtering and in maintenance time.
  • PRAW free tier: free in dollars, but you pay in OAuth app registration and review, and in living within the 100 req/min ceiling that PRAW backs off against.
  • Managed REST: no infrastructure work; priced per call at $0.002 per GET instead.

Native and PRAW are free in dollars but cost you engineering time and infrastructure. A managed path is free of that work and prices per call instead.

Reddit search cost table at volume: brand monitor, RAG refresh, and trend tracker monthly call counts and estimated managed cost versus free native and PRAW tiers

The numbers above are estimates at $0.002 per GET call; model your own with the cost calculator once you know your real query volume. The honest framing: at low volume, free wins, because the per-call cost of a managed path is real and the engineering tax of the native path is small when you are running a few hundred searches a day from your laptop. At production volume, the math flips. Maintaining a residential IP pool and OAuth plumbing for tens of thousands of monthly searches costs more in engineering time than the per-call fee, and a 403 that breaks a customer-facing feature costs more than both. Decide on your actual numbers, not on the sticker price. The pricing page and the rate-limits reference have the per-operation detail.
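The per-call arithmetic is easy to model yourself. This sketch uses the $0.002 per GET and $0.50 signup credit quoted in this article; the daily call volume in the example is hypothetical, and Decimal is used so the cents do not pick up floating-point noise.

```python
from decimal import Decimal

PRICE_PER_GET = Decimal("0.002")  # per-call price quoted in this article
FREE_CREDIT = Decimal("0.50")     # signup credit quoted in this article


def monthly_cost(calls_per_day, days=30, price=PRICE_PER_GET):
    """Estimated managed-path spend for a steady daily call volume."""
    return calls_per_day * days * price


def free_calls(credit=FREE_CREDIT, price=PRICE_PER_GET):
    """How many GET calls the signup credit covers."""
    return int(credit / price)


# Hypothetical workload: 500 searches a day for a 30-day month.
print(monthly_cost(500), free_calls())
```

At that example volume the estimate is 15,000 calls, or $30 a month, and the signup credit covers the first 250 calls.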

Choosing Your Path

To close the loop, map your situation to a path. Beginners on r/learnpython asking how to "search for specific keywords in ALL of reddit" want the native /search.json snippet from this tutorial: it is free and runs in minutes. Teams already on PRAW should use subreddit.search() and live within the 1,000-result cap. Anyone running search from a server, at concurrency, or where a 403 is unacceptable should move to a managed REST endpoint.

Decision tree for picking a Reddit search path: hobby native search, existing PRAW user, or production managed REST API

The data is identical across all three; what differs is who runs the auth, the IP pool, and the rate limiting. For native and PRAW, that is you. For the managed path, it is the provider, for $0.002 per call with $0.50 free at signup. Model your own volume against the rate-limits reference before you decide.


Verdict

For a notebook or a hobby project, the native /search.json endpoint is the right answer: free, no OAuth, and you can paste the snippet above and have results in two minutes from your own machine. The moment search has to run from a server, page past the first 1,000 results reliably, or survive Reddit's IP filtering, the native path turns into IP-pool and backoff plumbing you have to maintain. PRAW handles the OAuth and backoff but inherits the same result ceiling. A managed REST endpoint like https://api.redditapis.com/api/reddit/search returns the exact same posts through a clean pool for a flat per-call price, which is the trade most production teams end up making. Start free, measure your real volume, and upgrade only when the native limits actually bite. Grab a key and run the code.

Frequently asked questions.

Reddit exposes a public search endpoint at `https://www.reddit.com/search.json`. Pass your query with the `q` parameter, order results with `sort` (relevance, hot, top, new, comments), and set a time window with `t` (hour, day, week, month, year, all). Add `restrict_sr=1` on a subreddit URL to limit results to that community, or search all of Reddit by omitting it. The endpoint returns up to 100 results per call (`limit=100`) and uses `after` for pagination. No OAuth is required for public subreddits at up to 60 requests per minute. See the [parameter reference](/blogs/reddit-search-api-tutorial-2026#search-endpoint-parameters-the-reference) or [sign up for a free key](/signup).

You have three options. (1) Direct REST: GET `https://www.reddit.com/search.json?q=your+keyword&sort=new&limit=25` with the `requests` library and a descriptive User-Agent. (2) PRAW: call `reddit.subreddit('all').search('your keyword', sort='relevance', time_filter='week')`, which returns a generator of Submission objects. (3) Managed REST API: send one authenticated GET to `https://api.redditapis.com/api/reddit/search` and get a `posts` array plus an `after` cursor back. Each path returns the same underlying data with different setup and rate-limit tradeoffs. The full read and write endpoint set is in the [no-PRAW Python tutorial](/blogs/reddit-api-python-tutorial).

A 403 on search almost always means one of three things: (1) you are calling the OAuth host `oauth.reddit.com` without a valid bearer token, (2) your User-Agent header is missing or matches a blocked bot pattern, or (3) you are querying a private or quarantined subreddit. For public subreddits use `https://www.reddit.com/search.json` (not the OAuth host) and set a descriptive User-Agent like `myapp/1.0 (by u/yourusername)`. Reddit also blocks many datacenter and cloud IP ranges outright, so a 403 can mean your server IP is filtered even when the code is correct. See the [authentication and OAuth guide](/blogs/reddit-api-authentication-oauth-2026); a managed REST API routes through a clean pool and sidesteps this entirely.

Yes, for public subreddits. `https://www.reddit.com/search.json` and `https://www.reddit.com/r/<sub>/search.json` work unauthenticated at up to 60 requests per minute, as long as you send a real User-Agent and call from an IP Reddit has not filtered. OAuth is only needed to search private subreddits or to lift the rate ceiling to 100 requests per minute. The catch in 2026 is the IP filter: unauthenticated calls from cloud or datacenter hosts frequently return 403 regardless of headers. The [REST vs PRAW comparison](/blogs/reddit-data-api-rest-vs-praw-2026) walks through the access tradeoffs.

PRAW's `.search()` is bounded by Reddit's own pagination: you can retrieve up to 1,000 posts per query (10 pages of 100 results). Reddit's API does not expose an exhaustive full-text index, so you cannot pull every post ever made about a keyword. For recent content (the last 7 to 30 days) PRAW search is fine. For full historical coverage you need an indexed source like PullPush (free, no SLA) or a managed Reddit search API that maintains its own index with higher caps. See the [result-cap section](/blogs/reddit-search-api-tutorial-2026#the-1000-result-cap-and-what-to-do-about-it).

Pushshift was a community-built Reddit search index shut down in May 2023 after Reddit's API pricing changes. PullPush (pullpush.io) is its community-maintained successor, built on Pushshift's archived data. It exposes `/reddit/search/submission/` for posts and `/reddit/search/comment/` for comments, with `q`, `subreddit`, `after`, `before`, and `size` parameters. PullPush is free but has no uptime guarantee. For production search where reliability matters, a managed REST API is the more dependable option. See the [PRAW vs managed REST breakdown](/blogs/praw-vs-redditapis-rest-2026).

Unauthenticated calls to `reddit.com/search.json` are capped at 60 requests per minute. OAuth apps on the free Developer tier get 100 requests per minute. Reddit's commercial Data API tier, introduced in 2023, negotiates higher limits directly. PRAW reads the rate-limit headers and backs off automatically when you approach the ceiling. If you need to run search concurrently past 100 req/min, a managed API with its own infrastructure pool abstracts the limit away. See [Reddit API rate limits in 2026](/blogs/reddit-api-rate-limits-2026).

Yes, you can search comments as well as posts. On the native path, add `type=comment` to a subreddit search URL (`/r/<sub>/search.json?q=keyword&type=comment`), or use PullPush's `/reddit/search/comment/` endpoint for historical comment search. Reddit's own comment search is shallower than its post search and is restricted to within-subreddit queries in most cases. For cross-Reddit comment search at scale, an indexed third-party source is the practical route. This is a common research use case raised repeatedly in r/redditdev. The [no-PRAW Python tutorial](/blogs/reddit-api-python-tutorial) covers fetching a comment tree by permalink when you already know the post.
