Most LangChain pipelines start the same way: get reliable content, then split, embed, and store it.
browser.city’s Request API is the fastest path for that first step: it renders pages and returns clean markdown in one call.
If a site needs interaction (auth, clicking, multi-step flows), use Sessions (Playwright) or Humanized REST (/v1/do/*), and extract content only once the page is in the state you need.
1) URL -> markdown (Request API)
request.ts
```ts
// POST the target URL and ask for markdown back.
const res = await fetch("https://api.browser.city/v1/requests", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.BROWSERCITY_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://example.com", markdown: true }),
}).then((r) => r.json());

console.log(res.content);
```

```python
import os
import requests

api_key = os.environ["BROWSERCITY_API_KEY"]

# json= serializes the body and sets the Content-Type header.
res = requests.post(
    "https://api.browser.city/v1/requests",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"url": "https://example.com", "markdown": True},
).json()

print(res["content"])
```

```csharp
using System.Net.Http.Headers;
using System.Net.Http.Json;

var apiKey = Environment.GetEnvironmentVariable("BROWSERCITY_API_KEY")!;
var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

// PostAsJsonAsync serializes the anonymous object as the JSON body.
var res = await http.PostAsJsonAsync(
    "https://api.browser.city/v1/requests",
    new { url = "https://example.com", markdown = true });

// Prints the raw JSON response body.
Console.WriteLine(await res.Content.ReadAsStringAsync());
```

```java
import java.net.URI;
import java.net.http.*;

public class Request {
    public static void main(String[] args) throws Exception {
        var apiKey = System.getenv("BROWSERCITY_API_KEY");
        var http = HttpClient.newHttpClient();
        var body = "{\"url\":\"https://example.com\",\"markdown\":true}";

        var req = HttpRequest.newBuilder()
            .uri(URI.create("https://api.browser.city/v1/requests"))
            .header("Authorization", "Bearer %s".formatted(apiKey))
            // HttpClient does not set a content type for string bodies.
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        var res = http.send(req, HttpResponse.BodyHandlers.ofString());
        // Prints the raw JSON response body.
        System.out.println(res.body());
    }
}
```
2) Create documents (LangChain + generic)
langchain.ts
```ts
import { Document } from "@langchain/core/documents";

// `res` is the parsed JSON response from the Request API call.
const doc = new Document({
  pageContent: res.content,
  metadata: {
    source: res.url,
    contentType: res.contentType,
    status: res.status,
  },
});
```

```python
from langchain_core.documents import Document

# `res` is the parsed JSON response from the Request API call.
doc = Document(
    page_content=res["content"],
    metadata={
        "source": res["url"],
        "content_type": res["contentType"],
        "status": res["status"],
    },
)
```

```csharp
// LangChain is TS/Python; in C# keep the same shape (content + metadata).
// `res` is the Request API response deserialized into an object with these fields.
var doc = new Document(
    res.content,
    new()
    {
        ["source"] = res.url,
        ["contentType"] = res.contentType,
        ["status"] = res.status,
    });

// Type declarations must follow top-level statements in C#.
public record Document(string PageContent, Dictionary<string, object?> Metadata);
```

```java
// LangChain is TS/Python; in Java keep the same shape (content + metadata).
import java.util.Map;

public class Document {
    record BrowserCityDocument(String pageContent, Map<String, Object> metadata) {}

    record BrowserCityResponse(String content, String url, String contentType, int status) {}

    public static void main(String[] args) {
        // Stand-in for a deserialized Request API response.
        var res = new BrowserCityResponse(
            "# Example Domain", "https://example.com", "text/markdown", 200);

        var doc = new BrowserCityDocument(
            res.content(),
            Map.of(
                "source", res.url(),
                "contentType", res.contentType(),
                "status", res.status()));

        System.out.println(doc.pageContent());
    }
}
```
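Before building a document, it can help to drop failed or empty fetches so they never reach the vector store. The sketch below is a dependency-free Python illustration, not part of browser.city or LangChain: `to_document` is a hypothetical helper, and it assumes `status` is the HTTP status of the rendered page, which is worth checking against the Request API reference.

```python
from typing import Optional


def to_document(res: dict) -> Optional[dict]:
    """Map a Request API response to a content + metadata dict.

    Hypothetical helper. Returns None when the fetch looks unusable.
    Assumption: `status` is the HTTP status of the rendered page.
    """
    status = res.get("status")
    content = res.get("content") or ""

    # Skip responses without a 2xx status.
    if not isinstance(status, int) or not 200 <= status < 300:
        return None
    # Skip pages that rendered to nothing but whitespace.
    if not content.strip():
        return None

    return {
        "page_content": content,
        "metadata": {
            "source": res.get("url"),
            "content_type": res.get("contentType"),
            "status": status,
        },
    }
```

The returned dict has the same two keys as the LangChain `Document` constructor in the Python example, so it could be unpacked into one with `Document(**to_document(res))` once a `None` result has been ruled out.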
From here:
- split markdown into chunks (character or token-based)
- embed and store in your vector DB
- run retrieval + generation on demand
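As a rough illustration of the splitting step, the sketch below cuts markdown at headings first and then falls back to overlapping character windows for long sections. `split_markdown` and its `max_chars` and `overlap` parameters are hypothetical names chosen for this example; in a real pipeline LangChain's own text splitters are likely the better choice.

```python
import re


def split_markdown(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split markdown at headings, then into overlapping character windows.

    Hypothetical helper for illustration only.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")

    # Zero-width split before each ATX heading so a heading stays with its section.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)

    chunks: list[str] = []
    step = max_chars - overlap
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        # Long section: slide a window of max_chars, advancing by `step`.
        for start in range(0, len(section), step):
            chunks.append(section[start:start + max_chars])
            if start + max_chars >= len(section):
                break
    return chunks
```

Each chunk can then be wrapped in its own document, copying the page-level metadata (`source`, `status`) so retrieval results can be traced back to the URL they came from.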
What to use when
- Use the Request API for about 90% of ingestion (fast, simple, cheap).
- Use Sessions when you need real browser state and deterministic automation.
- Use Humanized REST when you want interactive steps but don’t want to run Playwright in your runtime.
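The rules above can be condensed into a small routing function, which may be handy when an ingestion job has to pick a path per site. This is only a sketch: `IngestionPath`, `choose_path`, and the two boolean flags are hypothetical names, not part of any browser.city SDK.

```python
from enum import Enum


class IngestionPath(Enum):
    REQUEST_API = "request_api"
    SESSIONS = "sessions"
    HUMANIZED_REST = "humanized_rest"


def choose_path(needs_interaction: bool, can_run_playwright: bool) -> IngestionPath:
    """Pick an ingestion path following the guidance in the list above."""
    # No interaction needed: a single render-and-extract call is enough.
    if not needs_interaction:
        return IngestionPath.REQUEST_API
    # Interaction needed and Playwright is available in this runtime.
    if can_run_playwright:
        return IngestionPath.SESSIONS
    # Interaction needed, but Playwright cannot run here.
    return IngestionPath.HUMANIZED_REST
```

For example, a serverless function that must log in before scraping would call `choose_path(needs_interaction=True, can_run_playwright=False)` and be routed to Humanized REST.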