[python] How to open my files in data_folder with pandas using relative path?

I'm working with pandas and need to read some CSV files. The structure is something like this:

folder/folder2/scripts_folder/script.py

folder/folder2/data_folder/data.csv

How can I open the data.csv file from the script in scripts_folder?

I've tried this:

absolute_path = os.path.abspath(os.path.dirname('data.csv'))

pandas.read_csv(absolute_path + '/data.csv')

I get this error:

File folder/folder2/data_folder/data.csv does not exist

This question is related to python pandas relative-path

The answer is


With plain Python or pandas, read_csv (pd.read_csv) resolves a relative file name against the current working directory, which by default is the directory the Python process was started from. One option is to use the os module to chdir() into the data directory and read from there.

import pandas as pd 
import os
print(os.getcwd())
os.chdir("D:/01Coding/Python/data_sets/myowndata")
print(os.getcwd())
df = pd.read_csv('data.csv',nrows=10)
print(df.head())
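Calling os.chdir() changes the working directory for the rest of the program, which can surprise later code. As a minimal sketch, assuming you only want the change to last for one read, a small context manager can switch directories and then restore the previous one. The name working_directory is a hypothetical helper, not part of pandas or the os module.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory, then restore it."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield Path.cwd()
    finally:
        # Always go back, even if the body raised an exception
        os.chdir(previous)
```

Usage would look like `with working_directory("../data_folder"): df = pd.read_csv("data.csv")`, after which the process is back in its original directory.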

This is answered in "Reading file using relative path in python project".

Basically using Path from pathlib you'll do the following in script.py

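from pathlib import Path
import pandas as pd
# Build the path relative to script.py itself, not the current working directory
path = Path(__file__).parent / "../data_folder/data.csv"
df = pd.read_csv(path)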
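The same idea can be wrapped in a function so the ".." step disappears from the final path. This is only a sketch under the folder layout from the question; data_file is a hypothetical helper name, and the script path is passed in as an argument (normally `__file__`) so the function can be tried outside of script.py.

```python
from pathlib import Path


def data_file(script_file, name="data.csv"):
    """Return the absolute path of data_folder/<name>, a sibling of scripts_folder."""
    # .../folder2/scripts_folder
    scripts_dir = Path(script_file).resolve().parent
    # .../folder2/data_folder/<name>
    return scripts_dir.parent / "data_folder" / name
```

Inside script.py this would be used as `pd.read_csv(data_file(__file__))`, and it gives the same result no matter which directory the script is launched from.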

Keeping things tidy with f-strings (the relative path below is resolved against the current working directory):

import pandas as pd

data_files = '../data_folder/'
csv_name = 'data.csv'

pd.read_csv(f"{data_files}{csv_name}")

You could use the __file__ attribute:

import os
import pandas as pd
df = pd.read_csv(os.path.join(os.path.dirname(__file__), "../data_folder/data.csv"))

Try

import pandas as pd
pd.read_csv("../data_folder/data.csv")

Pandas resolves relative paths against the current working directory, which is your script's folder only if you run the script from there. From that directory you can move to where your data is located with '..'. For example:

pd.read_csv('../../../data_folder/data.csv')

goes three levels up and then into data_folder (assuming it's there), or

pd.read_csv('data_folder/data.csv')

assuming data_folder is inside the directory you run the script from.
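In practice a relative path is resolved against the current working directory, so the same string can point at different files depending on where the script is launched. The sketch below illustrates this with the folder names from the question; resolve_from is a hypothetical helper, and posixpath is used so the output is the same on every platform.

```python
import posixpath


def resolve_from(cwd, relative):
    """Show where `relative` points when the process runs in `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, relative))


# Launched from scripts_folder: the path reaches the data file
from_scripts = resolve_from("/folder/folder2/scripts_folder", "../data_folder/data.csv")
print(from_scripts)  # /folder/folder2/data_folder/data.csv

# Launched from folder2: the same string points one level too high
from_folder2 = resolve_from("/folder/folder2", "../data_folder/data.csv")
print(from_folder2)  # /folder/data_folder/data.csv
```

Printing os.getcwd() just before the read_csv call is a quick way to see which of these cases applies.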


import pandas as pd
df = pd.read_csv('C:/data_folder/data.csv')

You can always point to your home directory using ~ and then refer to your data folder from there.

import pandas as pd
df = pd.read_csv("~/mydata/data.csv")

For your case, assuming folder sits directly in your home directory, it should look like this:

import pandas as pd
df = pd.read_csv("~/folder/folder2/data_folder/data.csv")

You can also set your data directory as a prefix

import pandas as pd
DATA_DIR = "~/folder/folder2/data_folder/"
df = pd.read_csv(DATA_DIR+"data.csv")

You can take advantage of f-strings as @nikos-tavoularis said

import pandas as pd
DATA_DIR = "~/folder/folder2/data_folder/"
FILE_NAME = "data.csv"
df = pd.read_csv(f"{DATA_DIR}{FILE_NAME}")
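String concatenation and f-strings only work if DATA_DIR ends with a slash. As an alternative sketch, pathlib can join the pieces and expand ~ explicitly. The directory below is the same assumed location as in the examples above, and data_path is a hypothetical helper name.

```python
from pathlib import Path

# Assumed location of the data, expressed relative to the home directory
DATA_DIR = Path("~/folder/folder2/data_folder").expanduser()


def data_path(file_name):
    """Join a file name onto DATA_DIR without worrying about trailing slashes."""
    return DATA_DIR / file_name
```

The result can be passed straight to pandas, for example `pd.read_csv(data_path("data.csv"))`.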

I was also looking for the relative path version, and this works OK. Note that when run (Spyder 3.6) you will see (unicode error) 'unicodeescape' codec can't decode bytes at the closing triple quote. The error comes from the \U in the C:\Users path inside the docstring, which Python reads as the start of a Unicode escape. Remove the offending comment lines 14 and 15, adjust the file names and location for your environment, and check the indentation.

# -*- coding: utf-8 -*-

""" Created on Fri Jan 24 12:12:40 2020

Source: Read a .csv into pandas from F: drive on Windows 7

Demonstrates: Load a csv not in the CWD by specifying relative path - windows version

@author: Doug

From CWD C:\Users\Doug\.spyder-py3\Data Camp\pandas we will load file

C:/Users/Doug/.spyder-py3/Data Camp/Cleaning/g1803.csv

"""

import csv

trainData2 = []

# Path is relative to the current working directory
with open(r'../Cleaning/g1803.csv', 'r') as train2Csv:
    trainReader2 = csv.reader(train2Csv, delimiter=',', quotechar='"')
    for row in trainReader2:
        trainData2.append(row)

print(trainData2)

# script.py
import os

# Directory containing this script: folder/folder2/scripts_folder
script_dir = os.path.abspath(os.path.dirname(__file__))

# Path to the CSV, relative to the script's directory
csv_filename = os.path.join(script_dir, '../data_folder/data.csv')
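When a path like this still fails, it helps to see the fully resolved location and the working directory in the error. A minimal sketch of such a check follows; require_file is a hypothetical helper, not something pandas provides.

```python
import os


def require_file(path):
    """Return the absolute form of `path`, or raise with debugging details."""
    full_path = os.path.abspath(path)
    if not os.path.isfile(full_path):
        raise FileNotFoundError(
            f"{full_path} not found (current working directory: {os.getcwd()})"
        )
    return full_path
```

It could be used as `pd.read_csv(require_file(csv_filename))`, so a wrong path reports where Python actually looked.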

For non-Windows users:

import pandas as pd
import os

os.chdir("../data_folder")
df = pd.read_csv("data.csv")

For Windows users:

import pandas as pd

df = pd.read_csv(r"C:\data_folder\data.csv")

The r prefix above makes the path a raw string, so the backslashes in the Windows path are not interpreted as escape sequences.
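To see why the raw-string prefix matters, the sketch below compiles two one-line programs: one with a plain string containing C:\Users, and one with the r prefix. The path is a made-up example and compiles is a hypothetical helper. Without the prefix, \U starts a Unicode escape and Python rejects the source with a 'unicodeescape' error.

```python
# Source text of two tiny programs, held in raw strings so the
# backslashes reach compile() unchanged
plain_source = r'path = "C:\Users\Doug\data.csv"'
raw_source = r'path = r"C:\Users\Doug\data.csv"'


def compiles(source):
    """Return True if `source` is valid Python, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(plain_source))  # False: \U is read as a truncated Unicode escape
print(compiles(raw_source))    # True: the r prefix keeps backslashes literal
```

Forward slashes, as in 'C:/data_folder/data.csv', avoid the problem as well.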

