Python Scraping asp.net with requests “session based SearchQueue is empty”

I ended up digging through the Microsoft Docs and then found an article by Alex Ronquillo covering the Python requests Session object, which outlined the information I needed. I modified the code to the following:

# CountyFormDataList is a local module holding the form-data templates,
# cookies, and headers used for the county search page
import CountyFormDataList

import requests
import json

from scrapy import Selector

with requests.Session() as session:
    url = "http://property.franklincountyauditor.com/_web/search/CommonSearch.aspx?mode=OWNER"

    # Initial request to pick up the session cookies and the ASP.NET hidden state fields
    r = session.post(url)

    # Extract the hidden state fields that must be echoed back on the next POST
    scriptManager = Selector(text=r.text).xpath('//*[@id="ScriptManager1_TSM"]/@value').get()
    viewState = Selector(text=r.text).xpath('//*[@id="__VIEWSTATE"]/@value').get()
    viewStateGenerator = Selector(text=r.text).xpath('//*[@id="__VIEWSTATEGENERATOR"]/@value').get()
    eventValidation = Selector(text=r.text).xpath('//*[@id="__EVENTVALIDATION"]/@value').get()

    # Fill the owner-search form template with the state fields and the search term
    payload = json.loads(
        "{" + CountyFormDataList.formDataList["CommonSearchASPX"]["search"]["ownerSearch"].format(
            scriptManager,
            viewState,
            viewStateGenerator,
            eventValidation,
            "SMITH"
        ) + "}"
    )
    cookies = CountyFormDataList.formDataList["CommonSearchASPX"]["cookies"]
    headers = CountyFormDataList.formDataList["CommonSearchASPX"]["headers"]

    # Submit the owner search within the same session
    r = session.post(url, data=payload, cookies=cookies, headers=headers)

    # The search response carries fresh state fields; pull them out again
    scriptManager = Selector(text=r.text).xpath('//*[@id="ScriptManager1_TSM"]/@value').get()
    viewState = Selector(text=r.text).xpath('//*[@id="__VIEWSTATE"]/@value').get()
    viewStateGenerator = Selector(text=r.text).xpath('//*[@id="__VIEWSTATEGENERATOR"]/@value').get()
    eventValidation = Selector(text=r.text).xpath('//*[@id="__EVENTVALIDATION"]/@value').get()

    # Fill the result-page template, including the result index string, and request the results
    payload = json.loads(
        "{" + CountyFormDataList.formDataList["CommonSearchASPX"]["result"]["resultJSON"].format(
            scriptManager,
            viewState,
            viewStateGenerator,
            eventValidation,
            "SMITH",
            "sIndex=0&idx=1"
        ) + "}"
    )

    r = session.post(url, data=payload, cookies=cookies, headers=headers)

# Dump the final response so the result page can be inspected
with open("ohioOutput.html", "w") as f:
    f.write(r.text)
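
For context, the CountyFormDataList module itself isn't shown above. Based purely on how it's used, it presumably has roughly the shape sketched below. This is a hypothetical sketch: the field names in capitals are placeholders of mine, not the site's actual form fields, and the real templates, cookies, and headers would come from inspecting the site's requests in the browser.

# Hypothetical sketch only: inferred from how formDataList is used above
formDataList = {
    "CommonSearchASPX": {
        "search": {
            # JSON fragment with positional slots for the four ASP.NET state
            # fields plus the owner name being searched
            "ownerSearch": '"ScriptManager1_TSM": "{0}", "__VIEWSTATE": "{1}", '
                           '"__VIEWSTATEGENERATOR": "{2}", "__EVENTVALIDATION": "{3}", '
                           '"OWNER_FIELD": "{4}"',
        },
        "result": {
            # Same idea, with a sixth slot for the result index string
            # (e.g. "sIndex=0&idx=1")
            "resultJSON": '"ScriptManager1_TSM": "{0}", "__VIEWSTATE": "{1}", '
                          '"__VIEWSTATEGENERATOR": "{2}", "__EVENTVALIDATION": "{3}", '
                          '"OWNER_FIELD": "{4}", "RESULT_FIELD": "{5}"',
        },
        "cookies": {},   # cookies copied from the browser session, if any
        "headers": {},   # request headers such as User-Agent and Content-Type
    },
}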

A simple adaptation like this, keeping every request inside one session, fixed the problem, and the page appeared to return the correct information. I don't fully understand all the intricacies behind the scenes, but I'm going to keep working on it. I hope this helps somebody in a similar situation.
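
As a small follow-up, the four hidden-field XPath lookups are repeated twice in the code above. If you extend this further, a minimal helper along these lines (my own naming, same Selector approach as above) keeps that logic in one place:

from scrapy import Selector

def get_asp_state(html):
    """Return the ASP.NET hidden state fields from a rendered page."""
    sel = Selector(text=html)
    ids = ["ScriptManager1_TSM", "__VIEWSTATE",
           "__VIEWSTATEGENERATOR", "__EVENTVALIDATION"]
    return {i: sel.xpath(f'//*[@id="{i}"]/@value').get() for i in ids}

# Usage inside the session, in place of the four separate lookups:
# state = get_asp_state(r.text)
# scriptManager, viewState = state["ScriptManager1_TSM"], state["__VIEWSTATE"]
# viewStateGenerator, eventValidation = state["__VIEWSTATEGENERATOR"], state["__EVENTVALIDATION"]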
