Comments (4)
Works perfectly now, thank you so much!
With regard to the sessions issue: I'm guessing that's something specifically related to how this agency set up their server. I hadn't thought about this until just now, but this chart and others hosted on the same server will often reset themselves when switching between filters in a browser.
Thank you again for coming up with a solution for both issues and for doing it so quickly.
from tableau-scraping.
@jbadelson Hello, the website seems to be only available in the US. I've managed to run it in repl.it (based in US) and extract all metadata for analysis
After investigating, the problem comes from the filter index. My guess is that the `table.schema[].ordinal` field indicates the start filter index:
    "schema": [{
        "caption": "Region",
        "collation": {
            "f": 0,
            "l": 4294967295
        },
        "column_class": 1,
        "dataType": "s",
        "family_name": "Extract",
        "fieldType": "N",
        "hidden": false,
        "name": ["sqlproxy.10q6ok10kzhj6r13vx7j91uvn4mv", "Region"],
        "ordinal": 1,    <==================================================
        "role": "d"
    }],
I didn't have any example showing that this field was important until now. The following three `setFilter` use cases all have `ordinal` set to `0`:
https://replit.com/@bertrandmartel/TableauCovidNewHampshire
https://replit.com/@bertrandmartel/TableauCovidSouthCarolina
https://replit.com/@bertrandmartel/TableauFilter
I will implement the parsing of the `ordinal` field in the `schema` object.
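As a rough illustration of the idea above, here is a minimal sketch that reads `ordinal` from a schema entry and falls back to `0` when the field is absent. The helper name `start_index_for` is hypothetical and not part of the library, and treating `ordinal` as the start index is still only the guess stated above.

```python
def start_index_for(schema, caption):
    """Return the assumed start filter index for the column with this caption.

    Hypothetical helper: it assumes `ordinal` is the start index, which is
    a guess from this thread rather than documented Tableau behaviour.
    """
    for column in schema:
        if column.get("caption") == caption:
            # Dashboards seen so far all used 0, so default to that.
            return column.get("ordinal", 0)
    raise KeyError(caption)


# Trimmed-down copy of the schema entry quoted above.
schema = [{"caption": "Region", "ordinal": 1, "role": "d"}]
print(start_index_for(schema, "Region"))  # prints 1
```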
I think it's more related to #5 because it deals with how `filterJson` parsing is implemented. Indexing issues are a major challenge in this library since the indexing is scattered across different places. For example, `filter` and `select` have completely different indexing mechanisms:
- `select`:
  - selection data is located in the `data` JSON data
  - data is located in the same location as the worksheet data, except that each selectable column has the `isAutoSelect` field set to `true`
  - data is located in the `dataDictionary` field and indexing info in the `paneColumnsData` field
- `filter`:
  - filter data is located in the `info` JSON data
  - data is located in a JSON called `filtersJson`, which has a `tuples` field mapping keys/values
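To make the `select` side of that comparison more concrete, here is a deliberately simplified sketch of dictionary-style indexing: values are stored once, and each column only holds positions into that store. The variable names and the flat layout are assumptions for illustration; the real `dataDictionary` and `paneColumnsData` payloads are nested differently.

```python
# Simplified stand-in for a value store (NOT the real payload layout).
data_dictionary = ["5 - Southwest", "8 - Monroe"]

# Simplified stand-in for a column that stores indices instead of values.
column_indices = [1, 0, 0, 1]


def resolve(indices, dictionary):
    """Turn a list of positions into the values they point at."""
    return [dictionary[i] for i in indices]


print(resolve(column_indices, data_dictionary))
```

An off-by-one in the index base silently shifts every resolved value, which is why a start offset matters so much for this kind of lookup.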
@jbadelson I've released v0.1.9, which fixes the index issue. But there is still something amiss when working with this URL. I don't know if it's related to the filters, but it seems it doesn't return data (`KeyError: 'dataDictionary'`) if I perform more than X `setFilter` calls in a row. For instance:
    from tableauscraper import TableauScraper as TS

    url = 'https://analytics.la.gov/t/LDH/views/covid19_hosp_vent_reg/Hosp_vent_c'

    ts = TS()
    ts.loads(url)
    workbook = ts.getWorkbook()
    ws = ts.getWorksheet('Hospitalization and Ventilator Usage')

    # collect the values offered by the "Region" filter
    regions = next(iter([
        t["values"]
        for t in ws.getFilters()
        if t["column"] == "Region"
    ]))
    print(regions)

    for region in regions:
        print(region)
        wb = ws.setFilter('Region', region)
        # ws is reassigned here, so each filter is applied on the previous result
        ws = wb.getWorksheet('Hospitalization and Ventilator Usage')
        print(ws.data)
It fails at `5 - Southwest` (or sometimes `8 - Monroe`), but if I restart the session (with a new TS object) I get the result for `5 - Southwest`, so the filter query seems correct.
Something is wrong with the session, and I think it's related to how the `dataDictionary` is managed between calls (maybe Tableau server specific). I still need to investigate, but for the moment you can work around it by re-instantiating the TS object if you need to iterate over all the regions.
This seems particularly unusual since https://replit.com/@bertrandmartel/TableauCovidNewHampshire#main.py for New Hampshire works well with more than 10 `setFilter` calls in a row.
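The workaround mentioned above (a new TS object per filter value) could look roughly like the sketch below. The function name `fetch_per_value` and its `make_scraper` parameter are hypothetical and exist only so the loop can be exercised without network access; the scraper calls themselves are the ones already used in the snippet above.

```python
def fetch_per_value(url, sheet, column, values, make_scraper=None):
    """Apply each filter value in a fresh session and collect the results."""
    if make_scraper is None:
        # Imported lazily so the sketch can be tried with a stand-in scraper.
        from tableauscraper import TableauScraper
        make_scraper = TableauScraper

    results = {}
    for value in values:
        ts = make_scraper()  # new session for every filter value
        ts.loads(url)
        ws = ts.getWorksheet(sheet)
        wb = ws.setFilter(column, value)
        results[value] = wb.getWorksheet(sheet).data
    return results
```

This trades speed for isolation: every value costs a full page load, but no state is carried from one `setFilter` call to the next.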
@jbadelson Everything is fixed in v0.1.10. The following gets all regions:
    from tableauscraper import TableauScraper as TS

    url = 'https://analytics.la.gov/t/LDH/views/covid19_hosp_vent_reg/Hosp_vent_c'

    ts = TS()
    ts.loads(url)
    workbook = ts.getWorkbook()
    ws = ts.getWorksheet('Hospitalization and Ventilator Usage')

    # collect the values offered by the "Region" filter
    regions = next(iter([
        t["values"]
        for t in ws.getFilters()
        if t["column"] == "Region"
    ]))
    print(regions)

    for region in regions:
        print(region)
        # always filter from the original worksheet
        wb = ws.setFilter('Region', region)
        regionWs = wb.getWorksheet('Hospitalization and Ventilator Usage')
        print(regionWs.data)
repl.it: https://replit.com/@bertrandmartel/TableauCovid19Louisiana
Note that I've added a warning for when the response doesn't return any data. For this dashboard that is normal behaviour, but most of the time it would mean that something is wrong in the input or that there is a bug.
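The general shape of "warn instead of raising `KeyError`" can be sketched as follows. This is not the library's actual code: the function name `get_data_dictionary`, the logger name, the message text, and the flat `response` dict are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("tableau_sketch")  # hypothetical logger name


def get_data_dictionary(response):
    """Return the data dictionary, or None (with a warning) when it is absent."""
    data_dictionary = response.get("dataDictionary")
    if data_dictionary is None:
        # Usually a sign of bad input or a bug, but some servers
        # legitimately answer without any data.
        logger.warning("no data dictionary present in response")
        return None
    return data_dictionary
```

Callers then need to handle `None` explicitly, which makes the empty-response case visible without stopping a loop over many filter values.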