cohere-ai / quick-start-connectors

This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and businesses to perform seamless retrieval-augmented generation (RAG) on their own data.

Home Page: https://docs.cohere.com/docs/connectors

License: MIT License

Dockerfile 0.39% Python 99.44% Perl 0.10% Shell 0.07%

connectors rag generative-ai llm

quick-start-connectors's People

lokeshjonnakuti maiquangtuan farisology touristshaun chiefofstaff-ai rkp64 tomchapin shukawam ofermend srini047 dcarpintero marclove intellibridgeaidev amoghparab1805 healx jsess79

quick-start-connectors's Issues

[Confluence] HTML Strip

Which connector is affected?

confluence

What would you like to see improved?

This is both a question and a suggestion:

Is Cohere cleaning the HTML when it is sent to the API?
If not, do you think removing things like styles from the HTML is advisable in the connector?
If HTML helps with context, we could keep or simplify the tags and remove the styles, right?
The UI is showing the raw HTML, which should probably be removed.
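
As a rough illustration of the suggestion above, here is a minimal sketch of stripping tags and style sheets from a page body before it is returned or chunked. It uses only Python's standard `html.parser`; the names `TextExtractor` and `strip_html` are hypothetical and are not part of the Confluence connector.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <style> and <script>."""

    SKIPPED = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def strip_html(html):
    """Return the plain text of an HTML fragment (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A connector could keep the original HTML in one field and put the output of `strip_html` in another, so that citations show clean text while the markup stays available for context.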

Thanks!

Additional information

No response

Sending additional parameters

Which connector is affected?

All sources that support filtering.

What would you like to see improved?

Is it possible to send additional parameters for metadata filtering?

response = co.chat(
    message="What is the chemical formula for glucose?",
    connectors=[{"id": "my-connector", "params": {"some_field": "some_value"}}],
)

The only way I can think of now is passing the parameters at creation time:

created_connector = co.create_connector(
    name="Example connector",
    url="https://connector-example.com/search?some_field=some_value",
)

But that's not very flexible.

Do you think calling the connector API directly with the filters, and then sending the results to the Cohere documents endpoint would do the trick?

curl --request POST \
    --url 'https://connector-example.com/search' \
    --header 'Content-Type: application/json' \
    --data '{
    "query": "How do I expense a meal?",
    "some_field": "some_value"
  }'

And then

            response = co.chat(
                message=message,
                documents=documents,
                conversation_id=self.conversation_id,
                stream=True,
            )

Is there a simpler way to achieve this filtering?
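
The two-step workaround described above could be sketched roughly as follows. This only covers the data handling on either side of the HTTP call. It assumes the connector answers with a JSON object holding a `results` list of flat records; `build_search_payload` and `to_documents` are hypothetical helper names, and `some_field` is the placeholder filter from the question.

```python
import json


def build_search_payload(query, **filters):
    """Merge the query and any extra filter fields into one /search request body."""
    payload = {"query": query}
    payload.update(filters)
    return json.dumps(payload)


def to_documents(search_response):
    """Turn the connector's response into a list suitable for `documents=`."""
    documents = []
    for result in search_response.get("results", []):
        # Stringify every value (assumption: documents are flat string maps).
        documents.append({key: str(value) for key, value in result.items()})
    return documents


# Usage idea: POST build_search_payload(...) to the connector's /search URL,
# parse the JSON reply, then pass to_documents(reply) as `documents` to co.chat.
```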

Thanks!

Additional information

No response

Can't create asana connector in Cohere

Which connector is affected?

Asana

What is the issue?

I'm able to hit my connector URL with Postman:

curl --location 'https://<my_url>/search' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ey.....' \
--data '{
    "query": "design"
  }'

Now I'm trying to create it on Cohere with no luck:

import dotenv from "dotenv";
dotenv.config();

import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({
  token: process.env.COHERE_API_KEY,
});


(async () => {
  const connector = await cohere.connectors.create({
    name: "asana-connector",
    url: "https://<my_url>/search",
    description: "Asana connector",
    service_auth: {
      type: "bearer",
      token: process.env.ASANA_CONNECTOR_TOKEN,
    },
  });

  console.log(connector);
})();

Both API keys are in place; I also tried hardcoding them.

This is the error:

BadRequestError: BadRequestError
Status code: 400
Body: {
  "message": "connector not reachable at https://<my_url>/search. error: request failed with status code 401"
}
  statusCode: 400,
  body: {
    message: 'connector not reachable at https://<my_url>/search. error: request failed with status code 401'
  }
}

Any ideas?

I'm following this documentation:

https://docs.cohere.com/reference/create-connector

Thanks

Additional information

No response

[Confluence] More flexible search

Which connector is affected?

Confluence

What would you like to see improved?

Currently the Confluence API is doing an AND query, meaning every word must appear in a document. This works poorly for conversational questions, because the exact wording of a question rarely appears in the documents.

The Confluence API has some search options:

https://confluence.atlassian.com/doc/confluence-search-syntax-158720.html

We could at least document these options.

What I'm doing as a workaround is manually entering ORs in the queries.

Not ideal but better than nothing.

What are your thoughts on this? Maybe training an intent classifier to build more sophisticated queries?

Like:

Detected entities: Emily , Project
(What's Emily doing on this project?) OR (Emily Project)
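
Short of an intent classifier, the manual OR workaround could be automated with something like the sketch below. It drops a small hand-picked stopword list and joins the remaining words with `OR`. The function name `to_or_query` and the stopword list are illustrative assumptions, not connector code.

```python
import re

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {
    "a", "an", "the", "is", "are", "on", "in", "of",
    "this", "that", "what", "who", "doing", "s",
}


def to_or_query(question):
    """Rewrite a natural-language question as an OR query of its key terms."""
    terms = []
    for word in re.findall(r"[A-Za-z0-9]+", question):
        if word.lower() in STOPWORDS or word in terms:
            continue
        terms.append(word)
    # Fall back to the original text if nothing survives the filter.
    return " OR ".join(terms) if terms else question
```

With this list, "What's Emily doing on this project?" becomes `Emily OR project`, which is close to the second half of the example query above.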

Appreciate your thoughts, @tianjing-li

Additional information

No response

Update template connector

Which connector is affected?

Flask template

What would you like to see improved?

Some of the logic and docs are outdated in the Flask connector, notably the lack of a provider.py or client.py. The README could also be improved.

Additional information

No response

Improve CONNECTOR_API_KEY documentation in all READMEs

Which connector is affected?

All connectors

What would you like to see improved?

Currently it can be confusing or unclear what the CONNECTOR_API_KEY environment variable is used for.

We need to clarify:

That the user will need to create and manage their own keys to secure the connector
That the CONNECTOR_API_KEY is required for all connectors
That this key will be used to authenticate requests to the connector when calling the /search API
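
To make the third point concrete, here is a minimal sketch of how a connector might check the bearer token on a /search request against `CONNECTOR_API_KEY`. The function name `is_authorized` is hypothetical; the actual connectors may perform this check differently.

```python
import hmac


def is_authorized(authorization_header, connector_api_key):
    """Return True if the header carries 'Bearer <key>' matching the configured key."""
    if not authorization_header or not connector_api_key:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), connector_api_key.encode())
```

The key itself is any secret the operator generates and stores; the same value is then supplied as the bearer token when the connector is registered or called.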

Additional information

No response

wrong credentials data loader in mongodb connector

Which connector is affected?

MongoDB

What is the issue?

dev/load_data.py has:

client = pymongo.MongoClient(
    host=os.environ.get("MONGODB_HOST", "mongo"),
    port=os.environ.get("MONGODB_PORT", 27017),
    username=os.environ.get("MONGODB_ROOT_USERNAME", "root"),
    password=os.environ.get("MONGODB_ROOT_PASSWORD", "example"),
)

Instead, it should be:

client = pymongo.MongoClient(
    connection_string,
)
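
One way `connection_string` could be obtained is sketched below. The variable name `MONGODB_CONNECTION_STRING` is an assumption, as is the fallback that assembles a URI from the individual settings quoted above; neither is confirmed to match the repository.

```python
import os
from urllib.parse import quote_plus


def mongo_connection_string(env=os.environ):
    """Prefer an explicit connection string; otherwise build one from parts."""
    explicit = env.get("MONGODB_CONNECTION_STRING")  # assumed variable name
    if explicit:
        return explicit
    host = env.get("MONGODB_HOST", "mongo")
    port = int(env.get("MONGODB_PORT", 27017))  # env values arrive as strings
    user = quote_plus(env.get("MONGODB_ROOT_USERNAME", "root"))
    password = quote_plus(env.get("MONGODB_ROOT_PASSWORD", "example"))
    return f"mongodb://{user}:{password}@{host}:{port}"


# Usage idea: client = pymongo.MongoClient(mongo_connection_string())
```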

Additional information

No response

hackernews connector doesn't always escape quotes adequately

Which connector is affected?

Hacker News

What is the issue?

The text field from the HackerNews API includes an HTML tag that is not escaped, which breaks the JSON.
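
A general way to avoid this class of bug is to build the response as Python objects and let `json.dumps` do the escaping, rather than assembling JSON by hand. The sketch below illustrates that idea only; `to_result` and `to_response_body` are hypothetical names, and the `results` wrapper is an assumption about the response shape.

```python
import json


def to_result(item):
    """Map one API item to a search result, leaving the text untouched."""
    return {"id": str(item.get("id", "")), "text": item.get("text", "")}


def to_response_body(items):
    """Serialize results; json.dumps escapes quotes inside HTML attributes."""
    return json.dumps({"results": [to_result(item) for item in items]})
```

Round-tripping the body through `json.loads` returns the original text, including any `<a href="...">` tags, without breaking the document.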

Additional information

No response

[Confluence] KeyError: 'content'

Which connector is affected?

confluence

What is the issue?

For some reason, some non-page objects make it through the code. They don't have the content key, so the connector crashes in those cases.

{'space': {'key': 'CFS', 'name': 'TEST', 'type': 'global', 'metadata': {}, 'status': 'current', '_expandable': {'operations': '', 'permissions': '', 'description': ''}, '_links': {'self': 'https://xxxx-xxx.atlassian.net/wiki/rest/api/space/CFS'}}, 'title': '@@@hl@@@TEST@@@endhl@@@ Financial Solutions', 'excerpt': '', 'url': '/spaces/CFS', 'resultGlobalContainer': {'title': 'TEST Financial Solutions', 'displayUrl': '/spaces/CFS'}, 'breadcrumbs': [], 'entityType': 'space', 'iconCssClass': 'aui-icon content-type-space', 'lastModified': '2024-02-05T14:04:51.000Z', 'friendlyLastModified': 'about 2 hours ago', 'score': 0.0}

I fixed it this way:

    async def _gather(self, pages, results):
        tasks = []
        for page in pages:
            # Added check for content to avoid errors
            if "content" not in page:
                continue
            # end of added check
            page_id = page["content"]["id"]
            tasks.append(self._fetch_page(page_id, results))
        return await asyncio.gather(*tasks)

But I'm not sure how this will impact the connector.

Thanks

Additional information

No response

[Confluence] Chunked HTML

Which connector is affected?

Confluence

What is the issue?

The content is not cleaned before chunking, which makes it very hard to clean up in the UI.

This is especially important for citations.

There are pages with giant style sheets making it into the citations.

I understand HTML can be helpful for context and formatting, but maybe including a field with a stripped version would help?

Sorry if this is expected behavior.

Additional information

No response

[Confluence] Use V2 API

Which connector is affected?

Confluence

What would you like to see improved?

Instead of the Python SDK (which is slowly being migrated to V2), just use requests to call their v2 API directly for search and get page.

Additional information

No response

[DOCS] Asana connector misleading docs.

Which connector is affected?

asana

What would you like to see improved?

Documentation

Additional information

The documentation is not clear. I'm following the steps and I'm seeing this error:

{
  "detail": "No authorization token provided",
  "status": 401,
  "title": "Unauthorized",
  "type": "about:blank"
}

I set my env vars:

ASANA_AUTH_TYPE=access_token
ASANA_ACCESS_TOKEN=2/12065118.....
ASANA_WORKSPACE_GID=12065.....
ASANA_CONNECTOR_API_KEY=eyJhbGc...

I found that the API call in the docs does not include the bearer token:

curl --location 'http://localhost:5000/search' \
--header 'Content-Type: application/json' \
--data '{
    "query": "BBQ"
  }'

After manually adding the header, I saw an error with the workspace ID, because admin.asana.com shows a wrong one.

I went to this URL to get the ID:
https://app.asana.com/api/1.0/workspaces

Final request (the Authorization header is missing from the docs):

curl --location 'http://localhost:5000/search' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer eyJhbGc........' \
--data '{
    "query": "BBQ"
  }'

I would suggest listing together all the env vars needed for each authentication method. Finding out that the workspace ID was required was not straightforward.

After getting it right, everything worked nicely. Thanks for your efforts! I would be glad to rework the docs and create a PR if you like.
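
The per-method grouping suggested here could also be enforced at startup. The sketch below lists only the `access_token` method, using the variable names from this report; the table `REQUIRED_ENV_VARS` and the function `missing_env_vars` are hypothetical, and other authentication methods would need their own entries.

```python
# Variables required for each ASANA_AUTH_TYPE (only one method is known here).
REQUIRED_ENV_VARS = {
    "access_token": [
        "ASANA_ACCESS_TOKEN",
        "ASANA_WORKSPACE_GID",
        "ASANA_CONNECTOR_API_KEY",
    ],
}


def missing_env_vars(env):
    """Return the names of required variables that are unset or empty."""
    auth_type = env.get("ASANA_AUTH_TYPE", "")
    required = REQUIRED_ENV_VARS.get(auth_type)
    if required is None:
        raise ValueError(f"Unsupported ASANA_AUTH_TYPE: {auth_type!r}")
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars(os.environ)` before serving requests would surface a missing workspace GID immediately instead of at the first search.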
