rainintensity's Introduction

Estimation of annual maximum rainfall intensity

This project plots Intensity-Duration-Frequency (IDF) curves by first estimating the annual maximum intensity for storms of 1-h, 2-h, 4-h, 8-h, 12-h and 24-h duration with return periods of 2, 5, 10, 50 and 100 years. An equation is also fitted to the plotted curves.
Rainfall_Data.xls contains hourly rainfall data (in mm) for 99 calendar years.

Introduction

Intensity-Duration-Frequency (IDF) curves describe the relationship between rainfall intensity, rainfall duration, and return period. IDF curves are commonly used in the design of hydrologic, hydraulic, and water resource systems. IDF curves are obtained through frequency analysis of rainfall observations.

Methodology

The first task is to find the annual maximum rainfall intensity for specific durations (or, equivalently, the annual maximum rainfall depth over those durations). The durations used for the design applications are 1-hr, 2-hr, 4-hr, 8-hr, 12-hr, and 24-hr.
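As a rough illustration of this step, the sketch below finds the largest rainfall depth over each duration in one year of hourly data and converts it to a mean intensity in mm/h. The function name `annual_max_intensity` and the sample values are made up for illustration, and the sketch assumes the year is given as one continuous hourly series, which is not necessarily how the repository's own script organizes the data.

```python
def annual_max_intensity(hourly_depths_mm, durations_h=(1, 2, 4, 8, 12, 24)):
    """Return {duration_h: max mean intensity in mm/h} for one year of hourly depths."""
    # Prefix sums: prefix[k] is the total depth of the first k hours,
    # so any window sum is a single subtraction.
    prefix = [0.0]
    for depth in hourly_depths_mm:
        prefix.append(prefix[-1] + depth)

    n = len(hourly_depths_mm)
    result = {}
    for d in durations_h:
        if d > n:
            continue  # not enough data for this duration
        max_depth = max(prefix[k + d] - prefix[k] for k in range(n - d + 1))
        result[d] = max_depth / d  # depth (mm) over duration (h) -> mm/h
    return result


# Made-up 24-hour sample, only to show the call.
sample_hours = [0, 0, 3, 7, 2, 0, 0, 1] + [0] * 16
intensities = annual_max_intensity(sample_hours)
print(intensities)
```

Repeating this for every year gives one set of annual maxima per duration, which is the input to the frequency analysis.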
The development of IDF curves requires a frequency analysis for each set of annual maxima, one set per rain duration. The basic objective of each frequency analysis is to determine the exceedance probability distribution function of rain intensity for that duration. To do so, fit a theoretical Extreme Value (EV) distribution (e.g., Gumbel Type I) to the observations, then use the fitted distribution to estimate the rainfall events associated with given exceedance probabilities.
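A minimal sketch of one such frequency analysis follows. It assumes a Gumbel (EV Type I) distribution fitted by the method of moments; the README does not say which fitting method the repository uses, so this is an illustrative choice. The function names and the sample maxima are made up. For return period T the exceedance probability is 1/T, and the corresponding quantile is `loc - scale * ln(-ln(1 - 1/T))`.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Estimate Gumbel location and scale from the sample mean and standard deviation."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)  # sample standard deviation
    scale = std * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_return_level(loc, scale, return_period_years):
    """Value exceeded on average once every `return_period_years` years."""
    p_exceed = 1.0 / return_period_years
    return loc - scale * math.log(-math.log(1.0 - p_exceed))


# Made-up annual maximum 1-h intensities (mm/h), only to show the call.
sample_maxima = [22.0, 31.5, 18.2, 27.9, 40.1, 25.3, 35.0, 29.4, 21.7, 33.8]
loc, scale = fit_gumbel_moments(sample_maxima)
levels = {T: gumbel_return_level(loc, scale, T) for T in (2, 5, 10, 50, 100)}
for T, value in levels.items():
    print(f"{T:>3}-year return level: {value:.1f} mm/h")
```

Running the same fit once per duration yields the points from which the IDF curves are drawn.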

Result

The results and plotted graphs are in the file 170492.pdf.

rainintensity's People

Contributors

pcgotan

rainintensity's Issues

Hi, don't use pandas for looping, try this.

Hi, I found your repo very helpful. I just have some minor suggestions.

First, the rainfall data you start with is in an unconventional layout, so I propose a simple pandas-based function that takes long-format (date, value) data and converts it to the matrix format.

This assumes that the rainfall data has finer-than-hourly granularity.

import pandas as pd
import polars as pl


def long_to_matrix_format(csv_path):
    """
    Convert a long format CSV with 'date' and 'value' columns
    to a matrix format with 'Year', 'Day', and hourly columns.

    Parameters:
    - csv_path: str, path to the long format CSV

    Returns:
    - 2-D float NumPy array in matrix format (Year, Day, Hour 1..Hour 24)
    """

    # Read the CSV with polars, then hand it over to pandas
    df = pl.read_csv(csv_path, try_parse_dates=True).to_pandas()

    # Convert the 'date' column to a pandas datetime format
    df["date"] = pd.to_datetime(df["date"])

    # Set the 'date' column as the index and aggregate to hourly totals
    df.set_index("date", inplace=True)
    df = df.resample("60min").sum()
    df["date"] = df.index.to_series()


    # Extract year, day of year, and hour
    df["Year"] = df["date"].dt.year
    df["Day"] = df["date"].dt.dayofyear
    df["Hour"] = (
        df["date"].dt.hour + 1
    )  # +1 to start hours from 1 to 24 instead of 0 to 23

    # Pivot the DataFrame to the desired matrix format
    matrix_df = df.pivot_table(
        index=["Year", "Day"], columns="Hour", values="value", aggfunc="sum"
    ).reset_index()

    # Rearrange columns to the desired order
    matrix_df = matrix_df[["Year", "Day"] + [i for i in range(1, 25)]]
    matrix_df.columns = ["Year", "Day"] + [f"Hour {i}" for i in range(1, 25)]

    # Fill NaN values with 0
    matrix_df.fillna(0, inplace=True)

    matrix_df["Year"] = matrix_df["Year"] - matrix_df["Year"].min() + 1

    # The copy yields a C-contiguous float64 array, which the numba signature below requires.
    return matrix_df.to_numpy().astype(float).copy()

#-----------------------------------------

Then, the for loop you use to build the data1 DataFrame is very slow, because it calls pandas vectorized functions once per (i, j) pair. My suggestion is to write it in pure Python and then compile it with numba.

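import numba
import numpy as np

# The original post also stacked an ahead-of-time export decorator here:
#   @cc.export('nb_get_idf_rolling_sum', 'float64[:,::1](float64[:,::1])')
# It needs a `cc` object (presumably a numba.pycc.CC instance) that the post does not define.
@numba.njit('float64[:,::1](float64[:,::1])')
def get_idf_rolling_sum(df_array):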
    rows = np.unique(df_array[:, 0]).size
    data = np.zeros((rows, 24))

    for i in range(rows):
        # Filter by year and drop the 'Year' and 'Day' columns
        df1_array = df_array[df_array[:, 0] == i + 1][:, 2:].T

        # Loop through 0 to 23 to calculate the max rolling sum for each window size
        for j in range(24):
            max_rolling_sum = -np.inf  # np.NINF was removed in NumPy 2.0

            # Loop through each column
            for col in range(df1_array.shape[1]):
                # Loop through each possible window in the current column
                for row in range(df1_array.shape[0] - j):
                    rolling_sum = np.sum(df1_array[row : row + j + 1, col])

                    # Update max rolling sum if this sum is greater
                    if rolling_sum > max_rolling_sum:
                        max_rolling_sum = rolling_sum

            # Store the max rolling sum in the data array
            data[i, j] = max_rolling_sum

    return data

The numba function replaces this part:

for i in range(99):
    df1 = (((df.where(df['Year']==i+1)).dropna()).drop(['Year','Day'],axis=1)).T
    for j in range(24):
        data[i][j] = max((df1.rolling(j+1).sum()).max())
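For checking either version on a small input, the same quantity can also be computed with NumPy cumulative sums. This is only a sketch, not part of the original suggestion: the name `max_window_sums` is hypothetical, it assumes the matrix layout produced above (columns Year, Day, Hour 1 to Hour 24), and it groups rows by whatever year values appear rather than assuming they run from 1 upward. As in the loop above, windows stay within a single day.

```python
import numpy as np


def max_window_sums(matrix):
    """For each year, return the largest within-day sum over windows of 1..24 hours."""
    years = np.unique(matrix[:, 0])
    out = np.zeros((years.size, 24))
    for i, year in enumerate(years):
        hours = matrix[matrix[:, 0] == year][:, 2:]  # shape (days, 24)
        # Prefix sums with a leading zero column, shape (days, 25).
        zeros = np.zeros((hours.shape[0], 1))
        csum = np.concatenate([zeros, np.cumsum(hours, axis=1)], axis=1)
        for w in range(1, 25):
            # Sum of hours k..k+w-1 equals csum[k + w] - csum[k].
            window_sums = csum[:, w:] - csum[:, :-w]
            out[i, w - 1] = window_sums.max()
    return out


# Tiny made-up matrix: one year, two days.
demo = np.zeros((2, 26))
demo[:, 0] = 1   # Year
demo[0, 1] = 1   # Day 1
demo[1, 1] = 2   # Day 2
demo[0, 4] = 5   # Day 1, Hour 3
demo[0, 5] = 2   # Day 1, Hour 4
demo[1, 2] = 4   # Day 2, Hour 1
demo_result = max_window_sums(demo)
print(demo_result[0, :3])
```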

#-------------------
This is a timing test on a long-format CSV file with 10 million rows, i.e.:

date,value
1981-01-01 01:00:00, 2.5
1981-01-01 02:00:00, 2.5
1981-01-01 03:00:00, 2.5
1981-01-01 04:00:00, 2.5
...
(continues up to 10 million rows)

# convert long format to matrix format
t0 = 0.5562264919281006

# process rolling sum
t1 = 0.024176836013793945

These two functions use polars, numpy, numba and pandas.

It would be nice if you turned this into a Python package, maybe adding a couple of features; there are not many Python packages for IDF curve construction.

Have a nice day,
Marcelo.
