catherinedevlin / ddl-generator
Guesses table DDL based on data
License: MIT License
Most databases require column names to begin with a letter or $.
For reverse engineering, it is very useful to get the structure of a DB with all linked entities in some visual form (for example a tree format suitable for d3.js representation).
The command given to get it repeats "git clone" twice (git clone git clone). There should be only one.
I'm not sure if I am missing something, but when trying to include a schema in table_name - the period gets converted to underscore.
Is there another parameter for setting schema? (I could not find one)
INSERT statements are built raw from strings, making them vulnerable to SQL injection
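As a rough illustration of the alternative, here is a minimal sketch using bound parameters with the standard-library sqlite3 module. The table and values are made up for the example, and this is not ddlgenerator's code. It only applies when the inserts are executed through a database connection; INSERT statements emitted as SQL text would need proper literal escaping instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (name TEXT, note TEXT)")

# A value that would break out of a naively string-built statement.
hostile = "x'); DROP TABLE pets; --"

# Values travel as bound parameters, never spliced into the SQL text.
conn.execute("INSERT INTO pets (name, note) VALUES (?, ?)", ("Rex", hostile))

stored = conn.execute("SELECT note FROM pets").fetchone()[0]
row_count = conn.execute("SELECT count(*) FROM pets").fetchone()[0]
```

The hostile string is stored verbatim as data and the table survives.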
Allow generating DDL from the output of another command. This should be easy to implement and give a ton of flexibility. "-" seems to be a standard to specify stdin. But it could also be implemented as a command line option.
Two ways I see this being used.
A command generates CSV that you want to generate DDL for:
getdata -f csv | ddlcreate oracle -
Adding a header when the CSV file doesn't include one. We can add a header to the csv via:
cat header.csv data.csv | ddlcreate oracle -
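A minimal sketch of how the "-" convention could be handled, assuming CSV input. The helper names (open_source, read_rows) are hypothetical and not part of ddlgenerator; the stdin argument exists only so the example can be run without a real pipe.

```python
import csv
import io
import sys

def open_source(name, stdin=None):
    """Return a text stream: stdin when name is "-", else the named file."""
    if name == "-":
        return stdin if stdin is not None else sys.stdin
    return open(name, newline="")

def read_rows(name, stdin=None):
    """Read all CSV rows from the source as dictionaries keyed by header."""
    stream = open_source(name, stdin)
    try:
        return list(csv.DictReader(stream))
    finally:
        # Only close streams this function opened itself.
        if stream is not sys.stdin and stream is not stdin:
            stream.close()

# Stands in for: cat header.csv data.csv | ddlcreate oracle -
piped = io.StringIO("name,cost\nwidget,3.22\n")
rows = read_rows("-", stdin=piped)
```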
Will allow
Include tox for automated testing
When a column name happens to be a database reserved keyword, it needs to be quote-enclosed in generated INSERT statements to make legal SQL. For instance: a column named "user" in PostgreSQL.
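One possible approach, sketched below, is to double-quote every identifier when building the statement. The function names are hypothetical and the %s placeholders assume a psycopg2-style driver. Note that in PostgreSQL a quoted identifier becomes case-sensitive, so the CREATE TABLE would have to quote names the same way.

```python
def quote_identifier(name):
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def build_insert(table, columns):
    """Build an INSERT template with quoted identifiers and %s placeholders."""
    cols = ", ".join(quote_identifier(c) for c in columns)
    marks = ", ".join("%s" for _ in columns)
    return f"INSERT INTO {quote_identifier(table)} ({cols}) VALUES ({marks})"

statement = build_insert("accounts", ["id", "user"])
```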
It seems that you have dateutils in your setup.py and requirements.txt. That refers to this package, which is not currently being used. It is possible this started out as confusion about the python-dateutil package name, which was corrected by adding python-dateutil, but you forgot to remove dateutils? You don't seem to be using it.
will allow pipelining
use tools like xlrd
Hi, this is such a nice package, but I found a bug.
Python 3.7.6
ddlgenerator 0.1.9
table = Table('my_data.csv', pk_name='index')
sql = table.sql('postgresql', inserts=False)
print(sql)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-c070dabded13> in <module>
----> 1 table = Table('db_data/shop_master.csv', pk_name='index')
2 sql = table.sql('postgresql', inserts=False)
3 print(sql)
4
5 # table_names = {'shop_master': 'shop_master', 'brand_master': 'brand_master', 'cateogry_master': 'cateogry_master', 'all_product_df': 'product_price'}
/opt/conda/lib/python3.7/site-packages/ddlgenerator/ddlgenerator.py in __init__(self, data, table_name, default_dialect, save_metadata_to, metadata_source, varying_length_text, uniques, pk_name, force_pk, data_size_cushion, _parent_table, _fk_field_name, reorder, loglevel, limit)
168 parent_name=self.table_name,
169 pk_name=pk_name,
--> 170 force_pk=force_pk)
171
172 self.default_dialect = default_dialect
/opt/conda/lib/python3.7/site-packages/ddlgenerator/reshape.py in unnest_children(data, parent_name, pk_name, force_pk)
284 field_names_used_by_children = defaultdict(set)
285 child_fk_names = {}
--> 286 parent = ParentTable(data, parent_name, pk_name=pk_name, force_pk=force_pk)
287 for row in parent:
288 try:
/opt/conda/lib/python3.7/site-packages/ddlgenerator/reshape.py in __init__(self, data, singular_name, pk_name, force_pk)
214 self.pk_name = pk_name
215 if force_pk or (self.pk_name and self.is_in_all_rows(self.pk_name)):
--> 216 self.assign_pk()
217 else:
218 self.pk = None
/opt/conda/lib/python3.7/site-packages/ddlgenerator/reshape.py in assign_pk(self)
254 raise Exception('Duplicate values in %s.%s, unsuitable primary key'
255 % (self.name, self.pk_name))
--> 256 self.use_this_pk(self.pk_name, key_type)
257 if suitability in ('absent', 'partial'):
258 for row in self:
/opt/conda/lib/python3.7/site-packages/ddlgenerator/reshape.py in use_this_pk(self, pk_name, key_type)
238 def use_this_pk(self, pk_name, key_type):
239 if key_type == int:
--> 240 self.pk = UniqueKey(pk_name, key_type, max([0, ] + all_values_for(self, pk_name)))
241 else:
242 self.pk = UniqueKey(pk_name, key_type)
TypeError: '>' not supported between instances of 'str' and 'int'
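The failing expression compares 0 with the column's values, so the error would be expected whenever those values arrive as strings, as they do from a CSV reader. That cause is an assumption; the report does not show the data. A minimal sketch of the failure and one possible guard, using a hypothetical helper rather than the library's own code:

```python
def to_int_or_none(value):
    """Convert value to int, or return None when it is not an integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

def max_existing_key(values):
    """Largest integer key among values, with 0 as the floor."""
    numeric = [n for n in map(to_int_or_none, values) if n is not None]
    return max([0] + numeric)

csv_values = ["3", "10", "7"]  # CSV readers yield strings, not ints

try:
    max([0] + csv_values)  # mirrors the expression shown in the traceback
    failed = False
except TypeError:
    failed = True

highest = max_existing_key(csv_values)
```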
... allowing PG to serve data still left in the source file
If the user tries to use ddlgenerator with psycopg2, but it is not installed, the import trap at line 6 of console.py makes it look like the import of Table failed. It's hard to figure out that psycopg2 is what is actually missing.
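A sketch of one way to make the failure name the missing module, assuming the optional driver is imported separately from the package's own imports. The require helper and the module name in the demo are hypothetical.

```python
import importlib

def require(module_name, purpose):
    """Import module_name or raise an ImportError that names what is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required {purpose} but could not be imported "
            f"(original error: {exc})"
        ) from exc

# Demo with a module name that should not exist on any system.
try:
    require("surely_not_installed_driver_xyz", "for live inserts")
    message = ""
except ImportError as exc:
    message = str(exc)
```

Keeping the optional import in its own try block means a failure in it can no longer be mistaken for a failure to import Table.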
mattjoiner@mattjoiners-LM-MBP:~$ ddlgenerator -i postgresql sample.json
Traceback (most recent call last):
  File "/usr/local/bin/ddlgenerator", line 11, in <module>
    sys.exit(generate())
  File "/usr/local/lib/python3.5/site-packages/ddlgenerator/console.py", line 112, in generate
    generate_one(datafile, args, file=file)
  File "/usr/local/lib/python3.5/site-packages/ddlgenerator/console.py", line 58, in generate_one
    loglevel=args.log, limit=args.limit)
  File "/usr/local/lib/python3.5/site-packages/ddlgenerator/ddlgenerator.py", line 170, in __init__
    force_pk=force_pk)
  File "/usr/local/lib/python3.5/site-packages/ddlgenerator/reshape.py", line 289, in unnest_children
    for (key, val) in row.items():
RuntimeError: OrderedDict mutated during iteration
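This error is what Python raises when keys are added to or removed from an OrderedDict while looping over row.items(). Assuming the loop body moves nested values out of the row (the report does not show it), iterating over a snapshot avoids the problem. A minimal sketch with a hypothetical helper:

```python
from collections import OrderedDict

def pop_nested(row):
    """Move list-valued fields out of row and return them as a dict."""
    children = {}
    # list(...) snapshots the items, so deleting keys below is safe.
    for key, val in list(row.items()):
        if isinstance(val, list):
            children[key] = val
            del row[key]
    return children

row = OrderedDict([("name", "Ann"), ("pets", [{"name": "Rex"}]), ("age", 41)])
children = pop_nested(row)
```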
so a "cost" column with ["$6", "$3.22"] should become a "cost_$" column with [Decimal('6'), Decimal('3.22')]
likewise physical units like lb, km2, etc
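A rough sketch of the currency case, assuming the symbol is a prefix on every value in the column. The function name and the all-or-nothing rule are illustrative choices, not existing behavior; suffix units such as lb or km2 would need a similar check on the end of each value.

```python
from decimal import Decimal, InvalidOperation

def split_unit_column(name, values, symbol="$"):
    """Return (new_name, decimals) if every value is symbol-prefixed, else None."""
    parsed = []
    for raw in values:
        text = raw.strip()
        if not text.startswith(symbol):
            return None  # mixed column: leave it untouched
        try:
            parsed.append(Decimal(text[len(symbol):]))
        except InvalidOperation:
            return None  # remainder is not a number
    return f"{name}_{symbol}", parsed

result = split_unit_column("cost", ["$6", "$3.22"])
```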