censortracker / censortracker_backend
The simple backend for Censor Tracker
License: MIT License
This part of the backend's logic must be as clear and transparent as possible.
Since we are working on Censor Tracker on GitHub but our CI/CD runs on GitLab, we need to configure our deployment process somehow. Maybe we can create a mirror of this repository on GitLab or something similar.
@HellsYeah @msva What do you think?
There is no need to store the user's IP address at all, even temporarily. You could use https://github.com/hadiasghari/pyasn to convert an IP address to an AS number offline and on the fly, and store only the AS number in the database. This has the same benefits as your current solution. As I understand it (correct me if I'm wrong), you save the full client IP address to the database and every 3 hours convert it to a client hash, region, and ISP info via an HTTP request (this definitely needs to be fixed) to a separate proprietary SaaS service.
pyasn includes scripts to download raw data from http://archive.routeviews.org. An alternative could be parsing the data from http://thyme.apnic.net. So store the ASN in the database. To get information about an ASN, pyasn includes a script that parses http://www.cidr-report.org/as2.0/autnums.html. There is also MaxMind's database, https://dev.maxmind.com/geoip/geoip2/geolite2-asn-csv-database/, and the relevant Python library. You could also use whois (or even RDAP, with https://rdap.arin.net/registry/autnum/<ASN>).
Anyway, having the ASN is more useful for tracking censorship events, and this conversion can be done entirely offline and on the fly, which protects user privacy better.
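To illustrate the idea, here is a minimal sketch of an offline IP-to-ASN lookup using only the Python standard library. It is not pyasn and not the project's code: the prefix table, the AS numbers, and the names ip_to_asn and record_for_storage are all hypothetical, and a real table would be built from routing data such as the sources listed above. The point it shows is that the client IP is only used in memory to find the most specific matching prefix, and only the ASN ends up in the stored record.

```python
import ipaddress

# Hypothetical prefix-to-ASN table for illustration only. The prefixes are
# documentation address ranges and the AS numbers are documentation ASNs;
# a real table would be generated from routing dumps.
PREFIX_TO_ASN = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "198.51.100.128/25": 64502,
    "2001:db8::/32": 64503,
}

# Parse the prefixes once so each lookup only does membership checks.
_NETWORKS = [(ipaddress.ip_network(p), asn) for p, asn in PREFIX_TO_ASN.items()]


def ip_to_asn(ip):
    """Return the ASN of the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in _NETWORKS:
        if addr.version == net.version and addr in net:
            # Longest prefix wins, as in routing.
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, asn)
    return best[1] if best else None


def record_for_storage(client_ip, domain):
    """Build the record to persist: the ASN is kept, the IP is discarded."""
    return {"domain": domain, "asn": ip_to_asn(client_ip)}
```

A linear scan is fine for a sketch; with a full routing table, a radix tree (which is what dedicated libraries typically use) would be the more realistic choice.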
[2021-07-27 03:59:00,053: INFO/MainProcess] Received task: Update IP data[620bbda5-4b20-4911-aa22-4143b3ee6684]
[2021-07-27 03:59:00,215: ERROR/ForkPoolWorker-2] Task Update IP data[620bbda5-4b20-4911-aa22-4143b3ee6684] raised unexpected: DataError('value too long for type character varying(64)\n')
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(64)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/celery/app/trace.py", line 412, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/lib/python3.9/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/app/server/apps/api/tasks.py", line 17, in run
call_command("update_ip_data")
File "/usr/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
return command.execute(*args, **defaults)
File "/usr/lib/python3.9/site-packages/django/core/management/base.py", line 369, in execute
output = self.handle(*args, **options)
File "/app/server/apps/api/management/commands/update_ip_data.py", line 21, in handle
update_ip_data()
File "/app/server/apps/api/management/commands/update_ip_data.py", line 98, in update_ip_data
case.save()
File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 745, in save
self.save_base(using=using, force_insert=force_insert,
File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 782, in save_base
updated = self._save_table(
File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 864, in _save_table
updated = self._do_update(base_qs, using, pk_val, values, update_fields,
File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 917, in _do_update
return filtered._update(values) > 0
File "/usr/lib/python3.9/site-packages/django/db/models/query.py", line 771, in _update
return query.get_compiler(self.db).execute_sql(CURSOR)
File "/usr/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1499, in execute_sql
cursor = super().execute_sql(result_type)
File "/usr/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1151, in execute_sql
cursor.execute(sql, params)
File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 100, in execute
return super().execute(sql, params)
File "/usr/lib/python3.9/site-packages/sentry_sdk/integrations/django/__init__.py", line 469, in execute
return real_execute(self, sql, params)
File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 68, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
File "/usr/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
django.db.utils.DataError: value too long for type character varying(64)
This causes duplicate notifications for the same domain.
We need a restriction_type field.
Create a CensorshipCase model with a many-to-one link to Domain. In CensorshipCase, we will store specific information about blocks and about the client (hash, region, etc.) for which the block was detected, and in Domain only the domains.
CensorshipCase can have any other, more appropriate name.
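Example:
from django.db import models

class Domain(models.Model):
    domain = models.CharField(..., unique=True)
    ...

class CensorshipCase(models.Model):
    # related_name names the reverse accessor on Domain, so domain.cases
    # returns the cases recorded for that domain.
    domain = models.ForeignKey(Domain, related_name='cases', ...)
    ...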
It's better to store the provider or region instead of the user's IP.
This is what /api/domains/ returns right now:
[
    {
        "domain": "s5.slivup.ch"
    },
    {
        "domain": "all-audio.pro"
    },
    {
        "domain": "https://example.com"
    }
]
It's better to return just a list of domains:
[
    "s5.slivup.ch",
    "all-audio.pro",
    "https://example.com"
]
The API is not in use yet, so we can change it easily.
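The change of response shape can be sketched in plain Python as follows. This is only an illustration under assumptions: flatten_domains is a hypothetical helper, not the project's view code, and the sample data simply mirrors the response shown above.

```python
import json


def flatten_domains(rows):
    """Turn [{"domain": "a"}, ...] into ["a", ...], keeping the order."""
    return [row["domain"] for row in rows]


# The current response shape, as shown above.
current_payload = [
    {"domain": "s5.slivup.ch"},
    {"domain": "all-audio.pro"},
    {"domain": "https://example.com"},
]

# The proposed response body: a flat JSON array of strings.
proposed_body = json.dumps(flatten_domains(current_payload))
```

If the endpoint builds its response from a Django queryset, values_list("domain", flat=True) would give the same flat sequence directly, assuming the field is named domain as in the example model.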