Comments (2)
You can navigate a DocumentArray via da['@c']
to work with chunk-level information (see the docs).
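The following example can be run as-is after pip install -U docarray
and should answer your question. Note that it relies on the Document/DocumentArray API; docarray 0.30 and later replaced that API, so on a newer install you may need to pin an older release, e.g. pip install "docarray<0.30".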
from typing import List
from docarray import Document, DocumentArray
def dummy_encode(sentences: List[str]):
    # stand-in for a real encoder: one fixed vector per sentence
    return [[1, 2, 3]] * len(sentences)
c1 = Document(text='hello')
c2 = Document(text='world')
d1 = Document(text='hello, world blah blah!', chunks=[c1, c2])
d1.display()
c3 = Document(text='hallo')
c4 = Document(text='welt')
d2 = Document(text='hallo, welt ja ja!', chunks=[c3, c4])
d2.display()
da = DocumentArray([d1, d2])
da.summary()
# embed on the "root" document
da.embeddings = dummy_encode(da.texts)
print(da.embeddings)
# let's reset the root
da.embeddings = None
# embed on the "chunk" document
da['@c'].embeddings = dummy_encode(da['@c'].texts)
print(da.embeddings) # -> None, as we reset it, and we did not embed it again
print(da['@c'].embeddings) # there we go!
# you can also access each chunk's embedding individually
print(c1.embedding)
print(c2.embedding)
Output:
<Document ('id', 'text', 'chunks') at 051c13a3cdb825644962bd214da539f8>
    └─ chunks
          ├─ <Document ('id', 'parent_id', 'granularity', 'text') at f7a9989942f5c00cc3680e846dbce361>
          └─ <Document ('id', 'parent_id', 'granularity', 'text') at ff102adb6538cce3020105ef8599b3ff>
<Document ('id', 'text', 'chunks') at 7f255c2481e33d9ab9c7f2eef8052963>
    └─ chunks
          ├─ <Document ('id', 'parent_id', 'granularity', 'text') at d4bf332de54be178224ab2b8fb3a5d44>
          └─ <Document ('id', 'parent_id', 'granularity', 'text') at f6691a41caa4986420d2ac01238759d8>
Documents Summary

  Length                    2
  Homogenous Documents      True
  Has nested Documents in   ('chunks',)
  Common Attributes         ('id', 'text', 'chunks')

Attributes Summary

  Attribute   Data type         #Unique values   Has empty value
  ────────────────────────────────────────────────────────────────
  chunks      ('ChunkArray',)   2                False
  id          ('str',)          2                False
  text        ('str',)          2                False
[[1, 2, 3], [1, 2, 3]]
None
[[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]
[1, 2, 3]
[1, 2, 3]
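If it helps to see why print(c1.embedding) shows a vector even though the embeddings were assigned through da['@c'], here is a rough plain-Python sketch of the idea. It is not docarray's implementation: Node and select_chunks are made-up names for illustration, and the only assumption carried over from the output above is that the chunk selection hands back the same chunk objects rather than copies.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a Document: text, optional embedding, child chunks."""
    text: str
    embedding: Optional[List[int]] = None
    chunks: List["Node"] = field(default_factory=list)


def select_chunks(roots: List[Node]) -> List[Node]:
    """Return a flat list of the roots' direct chunks (same objects, not copies)."""
    return [chunk for root in roots for chunk in root.chunks]


def dummy_encode(sentences: List[str]) -> List[List[int]]:
    # A fresh list per sentence, so the vectors are not aliases of one another.
    return [[1, 2, 3] for _ in sentences]


c1, c2 = Node("hello"), Node("world")
c3, c4 = Node("hallo"), Node("welt")
roots = [
    Node("hello, world blah blah!", chunks=[c1, c2]),
    Node("hallo, welt ja ja!", chunks=[c3, c4]),
]

# Flatten to chunk level, encode the chunk texts, and write the vectors back.
flat = select_chunks(roots)
for chunk, vector in zip(flat, dummy_encode([c.text for c in flat])):
    chunk.embedding = vector

print([r.embedding for r in roots])  # [None, None]: the roots were never embedded
print(c1.embedding)                  # [1, 2, 3]: set through the flattened view
```

Because the flattened list holds references to the original chunks, writing to it updates c1 through c4 directly, while the root documents keep their embedding as None.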
from docarray.
closed as no reply