ai-engineering's Introduction

Introduction

A Large Language Model (LLM) is an AI model designed to understand and generate human-like text after being trained on large amounts of text data. Training typically involves unsupervised (more precisely, self-supervised) learning, where the model learns to predict the next word in a sentence from the preceding words. This process helps the model capture statistical patterns and learn the relationships between words and phrases in the training data.
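
To make the idea of next-word prediction concrete, here is a toy sketch that counts which word follows which in a tiny corpus and predicts the most frequent follower. It only illustrates the statistical intuition; real LLMs use neural networks over much longer contexts, and the function names and corpus below are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram_model(corpus)
print(predict_next_word(model, "the"))  # prints "cat" ("cat" follows "the" twice)
```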

Announcement and Updates

  • Join this list to receive updates on new content about Data Engineering: Sign up here
  • Follow us on Twitter

What can this repo help with?

The focus of this code repository is to cover the features of LLMs and how they can be leveraged for real use cases using LangChain and OpenAI. The code in this repo is organized so that developers can gradually learn how to use this technology for building Python and web applications.

By using LangChain's modular abstractions, we can orchestrate conversational pipelines and reduce the amount of code needed for each step, which accelerates our development process.

Use Cases with AI

Developers can leverage AI for various use cases, including:

  • Code Generation: OpenAI models can generate code snippets or even complete functions based on provided prompts or user requirements. This can assist developers in automating repetitive coding tasks, exploring code possibilities, or providing code suggestions.

Coming Soon...

  • Virtual Assistants and Chatbots: OpenAI models can be used to develop conversational agents, virtual assistants, or chatbots that can understand and respond to user queries in natural language. This enables developers to build interactive applications and provide automated support to users.
  • Data Analysis and Predictive Modeling: With OpenAI models, developers can perform data analysis tasks, extract insights from large datasets, and develop predictive models. These models can be trained to understand patterns, make predictions, and assist with decision-making processes.
  • Language Translation: OpenAI models can be used for language translation tasks, allowing developers to build applications that can translate text between different languages.
  • Natural Language Processing (NLP): OpenAI models can perform a range of NLP tasks such as text generation, sentiment analysis, language translation, summarization, chatbot development, and more. Developers can integrate these capabilities into their applications to automate text-related tasks and enhance user experiences.

Prompt Engineering

Prompt engineering is the process of designing and optimizing prompts to better utilize LLMs. Well-described prompts help the AI model understand the context and generate more accurate responses. It is also helpful to provide some labels or expected results as examples, as these help the model calibrate its responses and produce more accurate results.
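
As a minimal sketch of providing labeled examples in a prompt (often called few-shot prompting), the helper below assembles an instruction, a few example pairs, and the new input into one prompt string. The function name, the "Text:"/"Label:" layout, and the sample data are assumptions made for illustration; they are not part of this repo or of any library.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Build a prompt with an instruction, labeled examples, and a new input.

    `examples` is a list of (text, label) pairs shown to the model as guidance.
    """
    lines = [instruction.strip(), ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The final entry leaves the label blank for the model to complete.
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)


# Hypothetical sentiment examples, used only to show the prompt shape.
prompt = build_few_shot_prompt(
    "Classify the sentiment of each text as positive or negative.",
    [("I love this library", "positive"), ("The build keeps failing", "negative")],
    "The documentation is excellent",
)
print(prompt)
```

The resulting string can be sent as the user message to whichever model client the application uses.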

Governance and Compliance

  • AI governance refers to the set of policies, processes, and frameworks that organizations establish to guide the development, deployment, and use of AI technologies. It involves defining ethical principles, standards, and best practices to ensure responsible and accountable AI practices within an organization.

  • AI compliance, on the other hand, refers to adherence to external regulations, laws, and industry standards governing the use of AI technologies. It involves ensuring that AI systems and processes comply with legal requirements, such as data protection regulations (e.g., GDPR), industry-specific guidelines, and ethical frameworks.

ai-engineering's Issues

Create a React Login Component

As a web developer, I want to create a React component with TypeScript for a login form that uses JSDoc for documentation, hooks for state management, includes a "Remember This Device" checkbox, and follows best practices for React and TypeScript development so that the code is maintainable, reusable, and understandable for myself and other developers, aligning with industry standards.

Needs:

  • Component named "LoginComponent" with state management using hooks (useState)
  • Input fields:
    • ID: "email" (type="email") - Required email field (as username)
    • ID: "password" (type="password") - Required password field
  • Buttons:
    • ID: "loginButton" - "Login" button
    • ID: "cancelButton" - "Cancel" button
  • Checkbox:
    • ID: "rememberDevice" - "Remember This Device" checkbox

Requirements:

  • Page header: "Please Enter Email and Password to Login"
  • Email and password fields with validation to prevent empty submissions.
  • "Remember This Device" checkbox (state managed with a hook).
  • Basic structure using React and TypeScript for further development.
  • JSDoc format (/** */) for component documentation.

Documentation Tags:

  • @file: LoginComponent.tsx
  • @description: React component for user login with email and password.
  • @author: [Your Name]
  • @param {string} email The user's email.
  • @param {string} password The user's password.
  • @returns {boolean} True if login is successful, false otherwise.

Additional Considerations:

  • Placeholder for handling form submission (e.g., sending login data to backend).
  • Placeholder for error handling and feedback for invalid input.
  • Placeholder for UI styling using a library or CSS.

Create a SQL Query to Get the Top Ten Visited Stations

As a database developer, I need to create a complex SQL query that utilizes CTEs (Common Table Expressions) and joins to extract and analyze data efficiently. This query should identify the top 10 most visited stations for each month of the year 2024, so that I can gain insights into station usage patterns and optimize resource allocation across the network.

Data Model:

  • dim_station: Contains information about stations, including station_id (primary key) and station_name.
  • dim_booth: Contains details about booths (not relevant for this query).
  • fact_commuter: Tracks commuter activity with the following fields:
    • created_dt (datetime) - Timestamp of entry/exit
    • entries (numeric) - Number of people entering the station
    • exits (numeric) - Number of people exiting the station
    • station_id (foreign key referencing dim_station.station_id)
    • booth_id (foreign key referencing dim_booth.booth_id) (not relevant for this query)

Requirements:

  • Utilize at least two CTEs to organize the query logic.
  • Perform a join between the fact_commuter and dim_station tables to link entries/exits with station names.
  • Filter data for the year 2024 based on the created_dt field in the fact_commuter table.
  • Calculate the total number of visitors per station per month by summing entries and exits.
  • Employ window functions (e.g., RANK) to rank stations within each month based on total visitors (descending order).
  • Retrieve and present the following data for each month:
    • month (extracted from created_dt)
    • station_name (from dim_station)
    • total_visitors (combined entries and exits)
    • rank

Desired Outcome:

The generated SQL code should effectively leverage CTEs and joins to achieve the stated goals. The output should be a clear table displaying the month, station name, total visitors, and rank for the top 10 most visited stations in each month of 2024.
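
One possible shape for such a query is sketched below. It is run through Python's built-in sqlite3 module against an in-memory database so that it can be tried without a server, which means it uses SQLite syntax (strftime for the month, window functions from SQLite 3.25 onward); other databases would need their own date functions. The sketch assumes dim_station has both station_id and station_name, stores created_dt as ISO text, and names the rank column station_rank. The sample rows are invented.

```python
import sqlite3

# Two CTEs: monthly totals per station, then a rank within each month.
TOP_STATIONS_SQL = """
WITH monthly_visits AS (
    SELECT
        strftime('%m', f.created_dt) AS month,
        s.station_name,
        SUM(f.entries + f.exits) AS total_visitors
    FROM fact_commuter AS f
    JOIN dim_station AS s ON s.station_id = f.station_id
    WHERE f.created_dt >= '2024-01-01' AND f.created_dt < '2025-01-01'
    GROUP BY strftime('%m', f.created_dt), s.station_name
),
ranked AS (
    SELECT
        month,
        station_name,
        total_visitors,
        RANK() OVER (PARTITION BY month ORDER BY total_visitors DESC) AS station_rank
    FROM monthly_visits
)
SELECT month, station_name, total_visitors, station_rank
FROM ranked
WHERE station_rank <= 10
ORDER BY month, station_rank, station_name;
"""


def run_demo():
    """Create a tiny in-memory dataset and return the ranked rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_station (station_id INTEGER PRIMARY KEY, station_name TEXT);
        CREATE TABLE fact_commuter (
            created_dt TEXT, entries INTEGER, exits INTEGER,
            station_id INTEGER REFERENCES dim_station(station_id)
        );
        INSERT INTO dim_station VALUES (1, 'Central'), (2, 'Harbor');
        INSERT INTO fact_commuter VALUES
            ('2024-01-05 08:00:00', 100, 50, 1),
            ('2024-01-06 09:00:00', 30, 20, 2),
            ('2024-02-01 08:00:00', 10, 5, 1),
            ('2023-12-31 23:00:00', 999, 999, 2);
    """)
    rows = conn.execute(TOP_STATIONS_SQL).fetchall()
    conn.close()
    return rows


for row in run_demo():
    print(row)
```

The 2023 row is excluded by the date filter, and RANK gives tied stations the same rank, so a month can return more than ten rows when there are ties at the cutoff.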

Transform the data frame by merging the date and time field into a single column

As a data scientist, I want to generate code using the following technologies, requirements, and specifications:

Technologies:

  • Python

Requirements:

  • Transform a data frame by consolidating the 'date' and 'time' columns into a single datetime column named 'created'.

Specifications:

  • Create a function with the name 'transform_data'
  • Use pandas to perform the data transformation
  • Load the data from a parameter with the CSV file path
  • Save the resulting data frame to disk in Parquet format
  • Return True if successful, or False if not
  • For coding standards, use the guidelines outlined in the PEP 8 style guide for Python.
  • Create a unit test for the transform_data function to verify its correctness. The unit test should cover different scenarios and assert the expected behavior of the function.
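
A rough sketch of what code meeting these specifications could look like follows. Several details are assumptions not stated in the issue: the output location is passed as a second parameter named parquet_path, the original 'date' and 'time' columns are dropped after merging, and the merge step is split into a helper named merge_date_time so it can be tested without touching disk. Writing Parquet with pandas also requires an engine such as pyarrow to be installed.

```python
import pandas as pd


def merge_date_time(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with 'date' and 'time' merged into 'created'."""
    result = df.copy()
    # Join the two text columns and parse them as one timestamp.
    result["created"] = pd.to_datetime(
        result["date"].astype(str) + " " + result["time"].astype(str)
    )
    # Assumption: the source columns are no longer needed after the merge.
    return result.drop(columns=["date", "time"])


def transform_data(csv_path: str, parquet_path: str) -> bool:
    """Load a CSV file, merge date and time, and save the result as Parquet.

    Returns True on success and False if loading, parsing, or saving fails.
    """
    try:
        df = pd.read_csv(csv_path, dtype={"date": str, "time": str})
        merge_date_time(df).to_parquet(parquet_path, index=False)
    except (OSError, KeyError, ValueError, ImportError):
        return False
    return True
```

A unit test can build a small data frame in memory, call merge_date_time, and assert on the 'created' column, while a second test checks that transform_data returns False for a missing file.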

Create the code generation flow with GitHub and OpenAI

Description

Create the code generation flow with GitHub and OpenAI. Issues with the user-story label should contain enough description for the code generation step to create a snippet that meets the requirements provided in the issue.
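
The first half of such a flow could be sketched as below: pick out the issues labeled user-story and turn each one into a prompt. The sketch assumes issues are dictionaries with "title", "body", and "labels" keys (each label carrying a "name"), as in the JSON that the GitHub REST API returns. Fetching the issues and calling the model are left out, and the function names and prompt wording are illustrative only.

```python
def select_user_stories(issues, label="user-story"):
    """Keep only the issues that carry the given label."""
    return [
        issue
        for issue in issues
        if any(item.get("name") == label for item in issue.get("labels", []))
    ]


def build_codegen_prompt(issue):
    """Turn an issue's title and body into a code generation prompt."""
    title = issue.get("title", "").strip()
    body = (issue.get("body") or "").strip()  # body may be None for empty issues
    return (
        "Generate code for the following user story.\n\n"
        f"Title: {title}\n\n"
        f"Requirements:\n{body}\n"
    )


# Hypothetical issues, shaped like GitHub API responses.
sample_issues = [
    {
        "title": "Create a React Login Component",
        "body": "As a web developer, I want to create a React component...",
        "labels": [{"name": "user-story"}],
    },
    {"title": "Fix typo in README", "body": None, "labels": [{"name": "bug"}]},
]

for story in select_user_stories(sample_issues):
    print(build_codegen_prompt(story))
```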
