I’m struggling to get a .csv file from a GitHub repository using Python [duplicate]


I’m trying to get a .csv file from this GitHub repository: link.
So far I have tried scraping it with:

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs4
url = 'https://github.com/hiring-lab/job_postings_tracker/blob/master/US/aggregate_job_postings_US.csv'
response = requests.request("GET", url)
soup = bs4(response.content, "html.parser")

But I could not find the table inside the soup.
I also found out that I could simply use:

link = 'https://github.com/hiring-lab/job_postings_tracker/blob/master/US/aggregate_job_postings_US.csv'
dados = pd.read_csv(link, sep=',', index_col='date')

But that didn’t work either. Output: ParserError: Error tokenizing data. C error: Expected 1 fields in line 40, saw 25


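Use the raw file URL. The github.com/.../blob/... address returns the HTML page that displays the file, not the CSV itself, which is why pd.read_csv fails to tokenize it. The raw.githubusercontent.com address returns the file contents directly: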

import pandas as pd


url = 'https://raw.githubusercontent.com/hiring-lab/job_postings_tracker/master/US/aggregate_job_postings_US.csv'

try:
    data = pd.read_csv(url, sep=',', index_col='date')
    print(data.head()) 
except Exception as e:
    print(f"An error occurred: {e}")

Output:

           jobcountry  ...        variable
date                   ...                
2020-02-01         US  ...  total postings
2020-02-02         US  ...  total postings
2020-02-03         US  ...  total postings
2020-02-04         US  ...  total postings
2020-02-05         US  ...  total postings

[5 rows x 4 columns]
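
If you only have the page URL, the raw URL can also be derived in code. This is a minimal sketch, assuming the usual github.com/<owner>/<repo>/blob/<ref>/<path> layout; the helper name github_blob_to_raw is hypothetical and not part of pandas or any GitHub library:

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Convert a github.com 'blob' page URL into its raw.githubusercontent.com URL.

    Hypothetical helper: it assumes the path looks like
    /<owner>/<repo>/blob/<ref>/<path/to/file> and simply drops the 'blob' segment.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError(f"Not a github.com URL: {url}")

    segments = parts.path.strip("/").split("/")
    # Need at least owner, repo, 'blob', ref and one path component.
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"Not a GitHub blob URL: {url}")

    owner, repo, _blob, *rest = segments
    raw_path = "/" + "/".join([owner, repo, *rest])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))


link = 'https://github.com/hiring-lab/job_postings_tracker/blob/master/US/aggregate_job_postings_US.csv'
raw_link = github_blob_to_raw(link)
print(raw_link)
```

The resulting raw_link can then be passed to pd.read_csv(raw_link, index_col='date') in place of the hard-coded URL.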
