ishuan,Ishan Agarwal,github

a-b-testing-a-new-menu-launch

Analyze the results of the experiment to determine whether the menu changes should be applied to all stores. The predicted impact on profitability should be enough to justify the increased marketing budget: at least an 18% increase in profit growth over the comparative period, measured relative to the control stores (otherwise known as incremental lift).
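As a rough illustration of the incremental lift check, here is a minimal Python sketch. It assumes lift is the treatment stores' profit growth minus the control stores' profit growth over the same periods; the repository may define it differently. The function names and the profit figures are hypothetical and are not taken from the project.

```python
import math


def growth(test_period, comparative_period):
    """Relative profit growth from the comparative period to the test period."""
    return (test_period - comparative_period) / comparative_period


def incremental_lift(treat_test, treat_comp, ctrl_test, ctrl_comp):
    """Assumed definition: treatment growth minus control growth."""
    return growth(treat_test, treat_comp) - growth(ctrl_test, ctrl_comp)


def justifies_rollout(lift, threshold=0.18):
    """True when the lift meets the 18% threshold stated in the description."""
    return lift >= threshold


# Made-up profits, for illustration only.
lift = incremental_lift(treat_test=1250, treat_comp=1000,
                        ctrl_test=1050, ctrl_comp=1000)
print(f"lift = {lift:.2%}, roll out: {justifies_rollout(lift)}")
```

Subtracting the control growth removes whatever change would have happened without the new menu, such as seasonality.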

android-apps-java

This repo contains the Android apps I have developed in Java.

android-apps-kotlin

This repository contains the Android app developed using Kotlin.

create-reports-from-a-database

This is a project in which reports have to be pulled from the 'northwind' database.

dbms

DBMS Term project

forecasting-video-game-demand

Forecast monthly sales data in order to synchronize supply with demand, aid in decision making that will help build a competitive infrastructure and measure company performance.

hackerrank-algorithm

hackerrank-java

Hands-on HackerRank practice in Java.

hackerrank-python

hackerrank_30daychallenge

Practice problems from the HackerRank 30 Day Programming Challenge, solved in Java.

information-retrieval

Information retrieval (IR) is concerned with finding material (e.g., documents) of an unstructured nature (usually text) in response to an information need (e.g., a query) from large collections. One approach to identifying relevant documents is to compute scores based on the matches between terms in the query and terms in the documents. For example, a document with words such as ball, team, score, championship is likely to be about sports. It is helpful to define a weight for each term in a document that can be meaningful for computing such a score. We describe below popular information retrieval metrics such as term frequency, inverse document frequency, and their product, term frequency-inverse document frequency (TF-IDF), that are used to define weights for terms.

Term Frequency: Term frequency is the number of times a particular word t occurs in a document d.

    TF(t, d) = No. of times t appears in document d

Since the importance of a word in a document does not necessarily scale linearly with the frequency of its appearance, a common modification is to use the logarithm of the raw term frequency instead.

    WF(t, d) = 1 + log10(TF(t, d)) if TF(t, d) > 0, and 0 otherwise

We will use this logarithmically scaled term frequency in what follows.

Inverse Document Frequency: The inverse document frequency (IDF) is a measure of how common or rare a term is across all documents in the collection. It is the logarithmically scaled inverse of the fraction of documents that contain the word, obtained by taking the logarithm of the ratio of the total number of documents to the number of documents containing the term.

    IDF(t) = log10(Total # of documents / # of documents containing term t)

Under this IDF formula, terms appearing in all documents are assumed to be stopwords and are assigned IDF = 0. We will use the smoothed version of this formula instead:

    IDF(t) = log10(1 + Total # of documents / # of documents containing term t)

With smoothing, a term that appears in every document gets IDF = log10(2) rather than 0, so its TF-IDF weight does not vanish. In practice, smoothed IDF helps alleviate the out-of-vocabulary (OOV) problem: it is better to return results to the user than nothing, even if the query matches every single document in the collection.

TF-IDF: Term frequency–inverse document frequency (TF-IDF) is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus of documents. It is often used as a weighting factor in information retrieval and text mining.

    TF-IDF(t, d) = WF(t, d) * IDF(t)
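The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the function names, the whitespace tokenization, and the three-document toy corpus are assumptions, and returning 0 for a term that appears in no document is a choice made here to avoid division by zero.

```python
import math
from collections import Counter


def wf(term, doc_tokens):
    """Log-scaled term frequency: 1 + log10(TF) if TF > 0, else 0."""
    tf = Counter(doc_tokens)[term]
    return 1 + math.log10(tf) if tf > 0 else 0.0


def smoothed_idf(term, corpus):
    """Smoothed IDF: log10(1 + N / df), where df counts documents containing the term."""
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0  # assumption: unseen terms get weight 0
    return math.log10(1 + len(corpus) / df)


def tf_idf(term, doc_tokens, corpus):
    """TF-IDF(t, d) = WF(t, d) * IDF(t)."""
    return wf(term, doc_tokens) * smoothed_idf(term, corpus)


# Toy corpus (hypothetical), tokenized by whitespace.
corpus = [
    "the team won the championship".split(),
    "the ball hit the post".split(),
    "the market fell".split(),
]

for term in ("the", "ball"):
    print(term, [round(tf_idf(term, doc, corpus), 4) for doc in corpus])
```

Because of the smoothing, "the" keeps a small positive weight even though it occurs in every document, while "ball" scores higher in the one document that contains it.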

ml_practice

page-rank-implementation

The goal of this programming assignment is to compute the PageRanks of an input set of hyperlinked Wikipedia documents using Hadoop MapReduce. The PageRank score of a web page serves as an indicator of the importance of the page. Many web search engines (e.g., Google) use PageRank scores in some form to rank the results of user-submitted queries. The goals of this assignment are to:

1. Understand the PageRank algorithm and how it works in MapReduce.
2. Implement PageRank and execute it on a large corpus of data.
3. Examine the output from running PageRank on Simple English Wikipedia to measure the relative importance of pages in the corpus.

To run your program on the full Simple English Wikipedia archive, you will need to run it on the dsba-hadoop cluster to which you have access.
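To show how one PageRank iteration splits into a map step and a reduce step, here is a small in-memory Python sketch. It is not the Hadoop implementation the assignment asks for: the function names, the damping factor of 0.85, the fixed iteration count, and the three-page graph are all assumptions, and pages without outgoing links (dangling nodes) are not handled.

```python
from collections import defaultdict


def map_phase(ranks, links):
    """Emit (target, contribution) pairs: each page splits its rank evenly over its outlinks."""
    for page, outlinks in links.items():
        for target in outlinks:
            yield target, ranks[page] / len(outlinks)


def reduce_phase(contributions, pages, damping):
    """Sum the contributions per page and apply the damping factor."""
    sums = defaultdict(float)
    for target, value in contributions:
        sums[target] += value
    n = len(pages)
    return {p: (1 - damping) / n + damping * sums[p] for p in pages}


def pagerank(links, damping=0.85, iterations=20):
    """Run a fixed number of map/reduce rounds, starting from uniform ranks."""
    pages = set(links) | {t for outs in links.values() for t in outs}
    ranks = {p: 1 / len(pages) for p in pages}
    for _ in range(iterations):
        ranks = reduce_phase(map_phase(ranks, links), pages, damping)
    return ranks


# Hypothetical link graph: page -> list of pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In a real MapReduce job the mapper would also need to re-emit each page's outlink list so the graph structure survives between iterations; here the `links` dictionary stays in memory, so that step is omitted.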


Ishan Agarwal's Projects
