
tex's Introduction

Guilherme Bacellar / Th3 0bservator

Security Researcher | Machine Learning and Fraud Detection Specialist | Cyber Espionage | Malware Creator | Mobile Reverse Engineer | OSINT Lover | Facial Biometrics Breaker | Writer

1 Public OSINT Tool, 2 Malware Families Names, 1 Injection Tool used on C|EH and OSEP Certifications and 9+ Bypasses on Facial Biometric Liveness Detection

Connect with me:

th3_0bservator https://www.linkedin.com/in/guilherme-bacellar/ https://theobservator.net/feed/


tex's People

Contributors

guibacellar

tex's Issues

Quality of Life - Automatic Database Maintenance

Today, database maintenance is only possible through a command-line command, which requires stopping all Telegram Explorer processes and can take a long time with hundreds of groups.

Create a new option to enable automatic maintenance while the "listen" command runs, to keep the DB healthy and as small as possible.

Allow specifying the cycle time (hourly, every 2 hours, etc.), the retention period (e.g., keep only 7 days, keep the last 60 days), and the retention of downloaded files separately by MIME type (like media final control). Note: an individual media retention period CANNOT be greater than the messages retention period.

Example Config:

[DB.AutoMaintenance]
enabled=true
messages_retention_period_days=60
default_media_retention_period_days=7

[DB.AutoMaintenance.Media.image/gif]
retention_period_days=30

[DB.AutoMaintenance.Media.application/rar]
retention_period_days=30
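
The rule in the note above (media retention cannot exceed messages retention) could be checked when the config is loaded. Below is a minimal sketch using Python's standard configparser. The function name load_retention and the "*" key for the default are illustrative assumptions, not part of TEx; only the section and option names come from the example config.

```python
import configparser

EXAMPLE = """
[DB.AutoMaintenance]
enabled=true
messages_retention_period_days=60
default_media_retention_period_days=7

[DB.AutoMaintenance.Media.image/gif]
retention_period_days=30

[DB.AutoMaintenance.Media.application/rar]
retention_period_days=30
"""

MEDIA_PREFIX = "DB.AutoMaintenance.Media."


def load_retention(text: str) -> dict:
    """Return {mime_type: days}; the '*' key holds the default media retention.

    Hypothetical helper: raises ValueError when any media retention period
    is greater than the messages retention period.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    main = parser["DB.AutoMaintenance"]
    messages_days = main.getint("messages_retention_period_days")
    retention = {"*": main.getint("default_media_retention_period_days")}
    for section in parser.sections():
        if section.startswith(MEDIA_PREFIX):
            mime_type = section[len(MEDIA_PREFIX):]
            retention[mime_type] = parser[section].getint("retention_period_days")
    for mime_type, days in retention.items():
        if days > messages_days:
            raise ValueError(
                f"media retention for {mime_type!r} ({days} days) exceeds "
                f"messages retention ({messages_days} days)"
            )
    return retention
```

Failing at load time would surface a bad config before any purge runs, which seems safer than discovering it in the middle of a maintenance cycle.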

Media Filter and Exportation

Add a new command that lets the user export media (filtered by type, name, and/or other information) either into reports or to a specific folder.

install

Can someone help me install this on a Windows PC? I've installed Python, but I cannot get this package to install. I tried everything I could (cmd in admin mode, the python module, Python 3.10), but I only get an error when I try to run the command:

Installing
Telegram Explorer is available through pip, so, just use pip install in order to fully install TeX.

pip install TelegramExplorer

Error: Module is Not Enabled...

2023-10-06 18:59:21,666 - INFO - [+] telegram_connection_manager.TelegramConnector
2023-10-06 18:59:21,962 - INFO - Authorizing on Telegram...
2023-10-06 18:59:22,292 - INFO - User Authorized on Telegram: True
2023-10-06 18:59:22,292 - INFO - [+] telegram_groups_scrapper.TelegramGroupScrapper
2023-10-06 18:59:22,296 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,296 - INFO - [+] telegram_groups_list.TelegramGroupList
2023-10-06 18:59:22,297 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,297 - INFO - [+] telegram_messages_scrapper.TelegramGroupMessageScrapper
2023-10-06 18:59:22,309 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,309 - INFO - [+] telegram_messages_listener.TelegramGroupMessageListener
2023-10-06 18:59:22,533 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,533 - INFO - [+] telegram_report_generator.telegram_report_sent_telegram.TelegramReportSentViaTelegram
2023-10-06 18:59:22,533 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,534 - INFO - [+] telegram_connection_manager.TelegramDisconnector
2023-10-06 18:59:22,538 - INFO - [+] telegram_report_generator.telegram_html_report_generator.TelegramReportGenerator
2023-10-06 18:59:22,563 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,563 - INFO - [+] telegram_report_generator.telegram_export_text_generator.TelegramExportTextGenerator
2023-10-06 18:59:22,563 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,563 - INFO - [+] telegram_report_generator.telegram_export_file_generator.TelegramExportFileGenerator
2023-10-06 18:59:22,564 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,564 - INFO - [+] telegram_stats_generator.TelegramStatsGenerator
2023-10-06 18:59:22,564 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,564 - INFO - [+] telegram_maintenance.telegram_purge_old_data.TelegramMaintenancePurgeOldData
2023-10-06 18:59:22,565 - INFO - Module is Not Enabled...
2023-10-06 18:59:22,565 - INFO - [*] Executing Termination:
2023-10-06 18:59:22,565 - INFO - [+] state_file_handler.SaveStateFileHandler

What should I do?

Updated messages

I follow groups where the admins often update a message with a new link or new information. How can I best store a new entry for every update? As far as I can see, the code currently stores only new messages and not updated messages.

DB Handling Improvements

Change DB Connection Timeout
Add Sharding to reduce DB Size and Report Stress on Message Ingestion

Keep Alive and Other Signals

Allow configuring some signals (KeepAlive, New Group, and others) to be sent via the Notification Engines.

Created from #47 issue request

Signals:
Keep-Alive > Notify Every T seconds that the listen-messages are running
New-Groups > Notify When a New Group appears on Database
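
The Keep-Alive signal could be sketched as a small asyncio task that runs next to the listener. This is only an illustration under stated assumptions: the keep_alive function, its notify callback, and the max_pings parameter are hypothetical names, not TEx APIs.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def keep_alive(
    notify: Callable[[str], Awaitable[None]],
    interval_seconds: float,
    max_pings: Optional[int] = None,
) -> int:
    """Call `notify` every `interval_seconds`; return how many pings were sent.

    `notify` stands in for whatever notification engine is configured.
    `max_pings` exists only so the loop can be stopped in tests; with None
    the loop runs until the task is cancelled.
    """
    sent = 0
    while max_pings is None or sent < max_pings:
        await notify("keep-alive: listen-messages is running")
        sent += 1
        await asyncio.sleep(interval_seconds)
    return sent
```

In a real listener this coroutine would probably be started with asyncio.create_task and cancelled on shutdown.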

Add More RegEx Expressions on Finders

Allow the Finder to use more than one regex at a time.

[FINDER.RULE.MessagesWithURL]
type=regex
regex1=aaa
regex2=bbb
regex3=ccc
notifier=NOTIFIER.DISCORD.MY_HOOK_2

Created from #47 Issue
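
One way the numbered regex keys could be read is sketched below. It is a sketch under assumptions: load_rule_patterns and matches_any are hypothetical helpers, and the way TEx actually parses finder rules may differ.

```python
import configparser
import re
from typing import List, Pattern

EXAMPLE = """
[FINDER.RULE.MessagesWithURL]
type=regex
regex1=aaa
regex2=bbb
regex3=ccc
notifier=NOTIFIER.DISCORD.MY_HOOK_2
"""


def load_rule_patterns(text: str, section: str) -> List[Pattern[str]]:
    """Compile every regexN option of one rule section, ordered by N."""
    # interpolation=None keeps '%' characters inside patterns untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    rule = parser[section]
    keys = sorted(
        (key for key in rule if re.fullmatch(r"regex\d+", key)),
        key=lambda key: int(key[len("regex"):]),
    )
    return [re.compile(rule[key], re.IGNORECASE | re.MULTILINE) for key in keys]


def matches_any(patterns: List[Pattern[str]], message: str) -> bool:
    """Return True when at least one pattern is found in the message."""
    return any(pattern.search(message) for pattern in patterns)
```

With this shape a single notifier fires if any of the expressions matches, which is the behaviour the request describes.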

R&D - Image Metadata

Investigate whether image metadata from downloaded assets can provide any useful information.

Bug: Incorrect Regex Handling on export_text Command

The following command fails with a regex compilation error:

export_text --config E:\TeX\config.ini --order_desc --limit_days 30 --regex (^\S+@\S+\.\S+$) --report_folder /folder/ --group_id *

2023-10-11 08:34:00,985 - INFO - Filtering
Traceback (most recent call last):
File "C:\projects\TelegramMonitor\TEx\__main__.py", line 22, in <module>
sys.exit(TelegramMonitorRunner().main())
File "C:\projects\TelegramMonitor\TEx\runner.py", line 70, in main
self.__execute_sequence(args, data, self.config['PIPELINE']['pipeline_sequence'].split('\n'), 'Pipeline')
File "C:\projects\TelegramMonitor\TEx\runner.py", line 100, in __execute_sequence
loop.run_until_complete(
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 616, in run_until_complete
return future.result()
File "C:\projects\TelegramMonitor\TEx\runner.py", line 122, in __execute_pipeline_item
await module_instance.run(
File "C:\projects\TelegramMonitor\TEx\modules\telegram_report_generator\telegram_export_text_generator.py", line 75, in run
await self.__export_data(
File "C:\projects\TelegramMonitor\TEx\modules\telegram_report_generator\telegram_export_text_generator.py", line 122, in __export_data
filtered_messages: List[str] = self.filter_messages(messages=messages, filter_regexs=filter_regexs)
File "C:\projects\TelegramMonitor\TEx\modules\telegram_report_generator\telegram_export_text_generator.py", line 149, in filter_messages
compiled_regex = [re.compile(item, flags=re.IGNORECASE | re.MULTILINE) for item in filter_regexs]
File "C:\projects\TelegramMonitor\TEx\modules\telegram_report_generator\telegram_export_text_generator.py", line 149, in <listcomp>
compiled_regex = [re.compile(item, flags=re.IGNORECASE | re.MULTILINE) for item in filter_regexs]
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\re.py", line 252, in compile
return _compile(pattern, flags)
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\re.py", line 304, in _compile
p = sre_compile.compile(pattern, flags)
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\sre_compile.py", line 764, in compile
p = sre_parse.parse(p, flags)
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\sre_parse.py", line 948, in parse
p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\sre_parse.py", line 443, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
File "C:\Users\Th3 0bservator\AppData\Local\Programs\Python\Python38\lib\sre_parse.py", line 593, in _parse
raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range \w-. at position 2
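
The traceback ends in re.compile rejecting a character class that contains `\w-.`, because a class shorthand such as `\w` cannot be the start of a range. A possible mitigation, sketched here with a hypothetical compile_filters helper (not the project's actual code), is to catch re.error per pattern and report it instead of crashing the pipeline.

```python
import re
from typing import List, Pattern, Tuple


def compile_filters(raw_patterns: List[str]) -> Tuple[List[Pattern[str]], List[str]]:
    """Compile each pattern; collect readable errors instead of raising."""
    compiled: List[Pattern[str]] = []
    errors: List[str] = []
    for raw in raw_patterns:
        try:
            compiled.append(re.compile(raw, flags=re.IGNORECASE | re.MULTILINE))
        except re.error as exc:
            errors.append(f"invalid regex {raw!r}: {exc}")
    return compiled, errors


# The first pattern is the one from the command line above; the second
# reproduces the "bad character range" failure. Writing the class as
# [\w.-] (hyphen last) avoids the error.
compiled, errors = compile_filters([r"(^\S+@\S+\.\S+$)", r"[\w-.]+@\S+"])
```

On Windows it may also be worth checking how the shell passes the pattern through, since characters such as ^ can be treated specially by cmd before Python sees them.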

Quality of Life request (not issue)

Good day, I love your Telegram scraper. It's been very beneficial for my use cases! May I request these few updates, if you have the time?

  1. A keep-alive ping sent to a Discord webhook every x minutes, so we know that the bot is still running

For example:
[keep-alive]
interval=300
notifier=NOTIFIER.DISCORD.MY_HOOK_1

  2. A way to put every search term into a single finder rule and notifier, because right now it's one rule for one notifier. Something like this:

For example:
[FINDER.RULE.MessagesWithURL]
type=regex
regex1=aaa
regex2=bbb
regex3=ccc
notifier=NOTIFIER.DISCORD.MY_HOOK_2

Thank you very much once again. I love this app!

OCR Support

Add OCR support to process all images (png, jpg, jpeg, etc.) in order to extract their text. Also, allow the Finder engines to process the OCR result.

How to use socks5 proxy

Errors:
raise ConnectionError('Connection to Telegram failed {} time(s)'.format(self._retries))
ConnectionError: Connection to Telegram failed 5 time(s)

In telegram_connection_manager.py, I added a proxy:

proxy = (socks.SOCKS5, 'IP address', port, username, password)
client = TelegramClient(
    os.path.join(session_dir, config['CONFIGURATION']['phone_number']),
    data['telegram_connection']['api_id'],
    data['telegram_connection']['api_hash'],
    catch_up=True,
    device_model='TeX',
    proxy=proxy
)

But it does not work. What should I do to make it work?
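
One thing that may be worth checking: in PySocks-style proxy tuples the fourth element is the rdns flag, so the username and password would be the fifth and sixth elements rather than the fourth and fifth. As far as I know, Telethon's documentation also describes a dictionary form for the proxy argument, which avoids the positional ambiguity. The sketch below only builds such a dictionary; build_socks5_proxy is a hypothetical helper, and the key names are my understanding of the Telethon docs, so please verify them against the installed version.

```python
from typing import Any, Dict, Optional


def build_socks5_proxy(
    addr: str,
    port: int,
    username: Optional[str] = None,
    password: Optional[str] = None,
    rdns: bool = True,
) -> Dict[str, Any]:
    """Build a proxy dictionary with explicit keys (assumed Telethon format)."""
    proxy: Dict[str, Any] = {
        "proxy_type": "socks5",
        "addr": addr,
        "port": port,
        "rdns": rdns,  # resolve hostnames on the proxy side
    }
    if username is not None:
        proxy["username"] = username
    if password is not None:
        proxy["password"] = password
    return proxy


# The result would then be passed as TelegramClient(..., proxy=proxy).
proxy = build_socks5_proxy("127.0.0.1", 1080, "user", "secret")
```

If the connection still fails, testing the proxy outside of TEx first would help separate a proxy problem from a configuration problem.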

CI/CD

Set up the CI/CD process to build, test, QA, and deploy to the PyPI repository.

Crash when Telegram Group has no Name

The application fails on the load_groups command when a group has no username.

File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
cursor.execute(statement, parameters)
sqlite3.IntegrityError: NOT NULL constraint failed: telegram_group.group_username

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/USER/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/USER/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/site-packages/TEx/__main__.py", line 22, in <module>
sys.exit(TelegramMonitorRunner().main())
File "/usr/local/lib/python3.10/site-packages/TEx/runner.py", line 70, in main
self.__execute_sequence(args, data, self.config['PIPELINE']['pipeline_sequence'].split('\n'), 'Pipeline')
File "/usr/local/lib/python3.10/site-packages/TEx/runner.py", line 97, in __execute_sequence
loop.run_until_complete(
File "/usr/local/USER/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/usr/local/lib/python3.10/site-packages/TEx/modules/telegram_groups_scrapper.py", line 102, in run
TelegramGroupDatabaseManager.insert_or_update(values)
File "/usr/local/lib/python3.10/site-packages/TEx/database/telegram_group_database.py", line 66, in insert_or_update
DbManager.SESSIONS['data'].execute(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2262, in execute
return self._execute_internal(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2144, in _execute_internal
result: Result[Any] = compile_state_cls.orm_execute_statement(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/bulk_persistence.py", line 1276, in orm_execute_statement
result = conn.execute(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
return meth(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
return connection._execute_clauseelement(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
ret = self._execute_context(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
return self._exec_single_context(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1984, in _exec_single_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) NOT NULL constraint failed: telegram_group.group_username
[SQL: INSERT INTO telegram_group (id, constructor_id, access_hash, group_username, title, fake, gigagroup, has_geo, restricted, scam, verified, participants_count, photo_id, photo_base64, photo_name, source) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (123, 123, '-123', None, 'GROUP NAME', 0, 0, 0, 0, 0, 0, 61, 123, '/9j/4AAQSkZJRgABAQEAeAB4AAD/4gHbSUNDX1BST0ZJTEUAAQEAAAHLAAAAAAJAAABtbnRyUkdCIFhZWiAAAAAAAAAAAAAAAABhY3NwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAA9tYAA ... (38226 characters truncated) ... AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//2Q==', 'image.jpg', '+12345678904')]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
zsh: segmentation fault sudo python3.10 -m TEx load_groups --config /\ts/memset/my_TEx_config.confi
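
The failure can be reproduced with plain sqlite3: inserting None into a NOT NULL column raises the same IntegrityError. The sketch below shows one possible workaround, mapping a missing username to an empty string before the insert. The reduced table layout, the group_username column name, and the normalize_username and insert_group helpers are assumptions for illustration; the project might prefer to relax the constraint instead.

```python
import sqlite3
from typing import Optional


def normalize_username(username: Optional[str]) -> str:
    """Map a missing username to an empty string (assumed fallback value)."""
    return username if username is not None else ""


conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the telegram_group table.
conn.execute(
    "CREATE TABLE telegram_group ("
    "id INTEGER PRIMARY KEY, group_username TEXT NOT NULL, title TEXT)"
)

INSERT_SQL = "INSERT INTO telegram_group (id, group_username, title) VALUES (?, ?, ?)"


def insert_group(db: sqlite3.Connection, group_id: int,
                 username: Optional[str], title: str) -> None:
    """Insert a group, tolerating groups that have no username."""
    db.execute(INSERT_SQL, (group_id, normalize_username(username), title))


# Reproduce the reported failure with the raw None value.
failure = ""
try:
    conn.execute(INSERT_SQL, (1, None, "GROUP NAME"))
except sqlite3.IntegrityError as exc:
    failure = str(exc)

# The normalized insert succeeds.
insert_group(conn, 123, None, "GROUP NAME")
```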

[Update] My personal testing on EC2 instance with 6000+ search terms

I want to share that TEx runs properly on an Amazon EC2 free-tier instance. You can sign up and let it run for free (I think) for 12 months. Please check their website for details.

I am using this machine image:
[image: qwer]

with this specification:
[image: qwer2]

The image does not have a native desktop GUI, so follow this guide to install one:

https://shrihariharidas73.medium.com/how-to-setup-gui-desktop-with-ubuntu-on-aws-ec2-ea713d836a58

Afterwards, you can pip install TEx as normal and run the scraper.

However, note that due to the limitations of the EC2 instance specs, it might be difficult to transfer files over the in-browser connection. You can avoid this by connecting over RDP and enabling shared drives to copy your logs over. Alternatively, you can use secure copy (scp).

Also, I am currently running with 5000+ search terms on the same machine. I had some issues with the new config format, where you can use multiple regexes for a single notifier, so I went back to the old way of using 1 regex for 1 notifier.

I have also created a Python script that helps convert the wordlist into the config file.

It's a very basic script, so amend the variables yourself :D

https://github.com/0xEnders/wordlist_to_config/blob/main/wordlist_to_config.py
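
For readers who want the general idea without opening the link, here is an independent minimal sketch of a wordlist-to-config conversion. It is not the linked script. The section naming (Term1, Term2, ...), the single regex option name, and the wordlist_to_config function are assumptions for illustration.

```python
import re
from typing import Iterable


def wordlist_to_config(words: Iterable[str], notifier: str) -> str:
    """Emit one finder rule per non-empty search term.

    Each term is escaped with re.escape so that characters such as '.'
    are matched literally rather than as regex syntax.
    """
    terms = [word.strip() for word in words if word.strip()]
    sections = []
    for index, term in enumerate(terms, start=1):
        sections.append(
            f"[FINDER.RULE.Term{index}]\n"
            "type=regex\n"
            f"regex={re.escape(term)}\n"
            f"notifier={notifier}\n"
        )
    return "\n".join(sections)


config_text = wordlist_to_config(["foo", "", "a.b"], "NOTIFIER.DISCORD.MY_HOOK_1")
```

Escaping matters with large wordlists, because a single term containing regex metacharacters could otherwise break or broaden a rule.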

[Investigation] Auto Split Database

Based on request Feature Request: #56

Investigate whether it is possible to auto-split the database by date (daily, maybe), what the impact on the code and all commands would be, and how to manage dozens of small DBs automatically.

Error reporting

Hello! When running the command "python3 -m TEx listen --config..." an error appears. Could you help me understand what it means?
I would be very grateful.

Error reporting

. Consider execute "load_groups" command to perform a full group synchronization (Members and Group Cover Photo).
2023-10-06 16:39:59,248 - ERROR - Unhandled exception on __handler
Traceback (most recent call last):
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\telethon\client\updates.py", line 570, in _dispatch_update
await callback(event)
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\TEx\modules\telegram_messages_listener.py", line 48, in __handler
await self.__ensure_group_exists(event=event)
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\TEx\modules\telegram_messages_listener.py", line 120, in __ensure_group_exists
group_dict_data: Dict = TelethonChannelEntityMapper.to_database_dict(
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\TEx\core\mapper\telethon_channel_mapper.py", line 19, in to_database_dict
'gigagroup': channel.gigagroup if channel.gigagroup else False,
AttributeError: 'User' object has no attribute 'gigagroup'
Traceback (most recent call last):
File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\TEx\__main__.py", line 22, in <module>
sys.exit(TelegramMonitorRunner().main())
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\TEx\runner.py", line 70, in main
self.__execute_sequence(args, data, self.config['PIPELINE']['pipeline_sequence'].split('\n'), 'Pipeline')
File "C:\Users\48242\AppData\Roaming\Python\Python310\site-packages\TEx\runner.py", line 101, in __execute_sequence
loop.run_until_complete(
File "C:\Program Files\Python310\lib\asyncio\base_events.py", line 628, in run_until_complete
self.run_forever()
File "C:\Program Files\Python310\lib\asyncio\windows_events.py", line 316, in run_forever
super().run_forever()
File "C:\Program Files\Python310\lib\asyncio\base_events.py", line 595, in run_forever
self._run_once()
File "C:\Program Files\Python310\lib\asyncio\base_events.py", line 1845, in _run_once
event_list = self._selector.select(timeout)
File "C:\Program Files\Python310\lib\asyncio\windows_events.py", line 434, in select
self._poll(timeout)
File "C:\Program Files\Python310\lib\asyncio\windows_events.py", line 783, in _poll
status = _overlapped.GetQueuedCompletionStatus(self._iocp, ms)
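
The AttributeError above suggests the listener received an event whose entity is a User (for example a private chat) rather than a channel, and User objects have no gigagroup attribute. One defensive option, sketched here with a hypothetical flags_to_dict helper rather than the project's real mapper, is to read optional flags with getattr and a default.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Flag names taken from the telegram_group columns shown in the INSERT
# statement of the load_groups issue.
OPTIONAL_FLAGS = ("fake", "gigagroup", "has_geo", "restricted", "scam", "verified")


def flags_to_dict(entity: Any) -> Dict[str, bool]:
    """Read optional boolean flags, treating missing attributes as False."""
    return {name: bool(getattr(entity, name, False)) for name in OPTIONAL_FLAGS}


# Stand-ins for Telethon entities: a user-like object without 'gigagroup'
# and a channel-like object that has it.
user_like = SimpleNamespace(scam=True, verified=False)
channel_like = SimpleNamespace(gigagroup=True, scam=False)
```

Whether private chats should be stored as groups at all is a separate design question; skipping non-channel entities in the handler would be another way to avoid the crash.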

[Feature Request] Multiple features

Hi again! Thank you for the quick updates and implementation of features. I have tested the V0.3.0-dev build and everything works perfectly!

I have a few more QOL requests, if you think they are useful:

  1. Translation:
    Would it be possible to add a translation function to the listener? I understand that it might be slow if you pipe from Telegram > translator > Discord, so it could be optional. It could work like the OCR option, e.g.
    [TRANSLATION]
    enabled=true

message
===TRANSLATION===
translated message

  2. Cut the db file every day:

I am not sure if this is possible, but could there be an option to cut the .db file every day so we can run our own scripts to export it or mail it somewhere else? For example, something like
[EXPORT_DB]
enabled=true
time_to_cut_file=0000H GMT+0

  3. Config file for exporting the db to CSV:
    An option to export the sqlite3 db to CSV with our custom headers.
    For example, the telegram_message CSV currently starts with id, group_id, media_id, etc. Would it be possible to use another config file to choose what to export? E.g.,
    #export config file
    [EXPORT_CSV]
    header=group_id
    header=group_username
    header=title
    header=message
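
The CSV request in item 3 could look roughly like the sketch below, which selects only the requested columns from a sqlite3 table and writes them with Python's csv module. The export_messages function and the reduced telegram_message layout are assumptions for illustration. One caveat on the proposed config: Python's configparser rejects repeated option names such as several header= lines by default, so a single comma-separated option might be easier to parse.

```python
import csv
import io
import sqlite3
from typing import List, TextIO


def export_messages(conn: sqlite3.Connection, headers: List[str], out: TextIO) -> None:
    """Write the chosen telegram_message columns as CSV, header row first."""
    available = {row[1] for row in conn.execute("PRAGMA table_info(telegram_message)")}
    unknown = [name for name in headers if name not in available]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    # Names are validated against the table above, then quoted as identifiers.
    columns = ", ".join(f'"{name}"' for name in headers)
    writer = csv.writer(out)
    writer.writerow(headers)
    writer.writerows(conn.execute(f"SELECT {columns} FROM telegram_message"))


# Hypothetical, reduced table used only to demonstrate the export.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE telegram_message "
    "(id INTEGER, group_id INTEGER, media_id INTEGER, message TEXT)"
)
conn.execute("INSERT INTO telegram_message VALUES (1, 10, NULL, 'hello')")

buffer = io.StringIO()
export_messages(conn, ["group_id", "message"], buffer)
```

Columns such as group_username and title appear to live in the group table, so exporting them alongside messages would need a join that this sketch leaves out.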

Thank you once again for your hard work!

Rule creation with YAML

Ability to create rules in YAML format and share them with the community:

Senhas.yaml
CPF.yaml
Biofac.yaml

listen messages error

Hi all, when running the command to start "listen messages", the errors shown in the figure appear.
What could be the cause?
The file configuration was done correctly, and so were the login and the group update.
Thanks for the support.

[Errno 2] No such file or directory: '/usr/home/tex_data/'

Hi there,

I installed TEx through pip and everything was OK. When I run the command python3 -m TEx connect --config /usr/my_TEx_config.config, I get this error message: os.mkdir(config['CONFIGURATION']['data_path'])
FileNotFoundError: [Errno 2] No such file or directory: '/usr/home/tex_data/'
I checked and the folder wasn't there, so I created it. It exists now, but I still get the same error message.
Any idea why?
Thanks!
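
A possible explanation for the first failure: os.mkdir creates only the last path component, so it raises FileNotFoundError when a parent directory (here /usr/home) does not exist. The sketch below shows the more tolerant os.makedirs call. The ensure_data_path helper is a hypothetical name, not a TEx function, and this may not be the cause of the error that remains after creating the folder by hand.

```python
import os
import tempfile


def ensure_data_path(data_path: str) -> str:
    """Create data_path with any missing parents; do nothing if it exists."""
    expanded = os.path.abspath(os.path.expanduser(data_path))
    os.makedirs(expanded, exist_ok=True)
    return expanded


# Demonstration in a temporary directory: 'home' does not exist yet, so
# os.mkdir on the nested path would fail while os.makedirs succeeds.
base = tempfile.mkdtemp()
target = os.path.join(base, "home", "tex_data")
created = ensure_data_path(target)
```

If the error persists with an existing folder, checking that data_path in the config file points to exactly that folder, and that the user running TEx can write to it, would be the next step.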
