sypets / brofix
Check for broken links, forked from TYPO3 system extension linkvalidator
License: Other
We can already manage the "exclusions" records stored in tx_brofix_exclude_link_target with the list module, but it's not optimal. We want to add a new tab listing those records, with a few functionalities:
We will provide a Pull Request for these new functions.
Describe the bug
If no allowed languages are defined for a BE user / group ([allowed_languages] is not set), only broken links in records of default language are shown for editors.
To Reproduce
Steps to reproduce the behavior:
Only broken links in elements of default language are displayed for non-admin users.
Expected behavior
The editor should also see broken links in the translated elements.
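The expected behavior can be sketched as follows. This is a minimal illustration, assuming that an empty `allowed_languages` value means "no language restriction" rather than "default language only". The function name and the data shapes are hypothetical and not taken from the brofix code base.

```python
def visible_languages(allowed_languages, record_languages):
    """Return the language uids of records an editor may see.

    allowed_languages: comma-separated uid list from the BE user / group,
    or an empty string when nothing is configured (assumed to mean "all").
    record_languages: language uids of the records with broken links.
    """
    if allowed_languages.strip() == "":
        # Nothing configured: do not restrict to the default language (0).
        return sorted(set(record_languages))
    allowed = {int(uid) for uid in allowed_languages.split(",") if uid.strip()}
    return sorted(uid for uid in set(record_languages) if uid in allowed)
```

With an empty setting, broken links in translated records (language uid greater than 0) stay visible; with an explicit list, only the listed languages are shown.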
System (please complete the following information):
Fix bugs before adding bells and whistles
The extension should be configurable and extendable, but use defaults that will work well in most use cases.
It should be possible to start without having to configure and set up a lot. Unfortunately, this is not entirely possible, but we try to use good defaults wherever possible, use the settings from global configuration (where available) and generate defaults.
We adhere to TYPO3 core conventions as much as possible. This means we also enforce the TYPO3 coding guidelines, for example. Tools exist to check locally. For PRs, GitHub Actions run and perform the checks.
See CONTRIBUTING.md in this repo for more information, also see commands in composer.json and .github/workflows/ci.yaml.
For new features or bugfixes, tests should be added.
It should be fun to use or at least a good experience. This means - wherever possible:
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
More control for the user:
Also:
Good resource: https://www.mobilespoon.net/2019/11/design-ui-tables-20-rules-guide.html
image 1: duplicate information in left columns if more than one broken link in record / on page
There are now several action buttons. Where should they be placed?
On the one hand, context-specific buttons are visually distracting. They are recognizable as buttons, but there is too much text, and it takes some getting used to before one can tell what is plain information (e.g. language icon, content type icon) and what is an action. So it would be better to group the actions in an action column. This also corresponds to other views in the core, e.g. the redirects module:
On the other hand, there is a distinction between 2 parts of the link: the left part corresponds to where the link is (the content element on the page), the right part corresponds to the link target (the URL). It is clearer to put the action buttons where they belong.
I very much like the German vaccination dashboard: https://impfdashboard.de/
Some principles / ideas are explained on Twitter by Moritz Stefaner, e.g.:
Describe the bug
Previously, the CI ran twice when a pull request was created from a feature branch within the same repository (not a fork). The changes follow the recommendation given by @isaac in the "When Should CI Run?" chapter of his blog post "CI with GitHub Actions for Ember Apps"
as in
https://github.com/jelhan/create-github-actions-setup-for-ember-addon/pull/7
https://github.com/jelhan/create-github-actions-setup-for-ember-addon/issues/5
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Run tests only once
Additional context
Could also be resolved by creating a fork and pushing directly to master
The URL in the tx_brofix* table is not the same as the URL in the RTE, with the result that the link is not marked as broken in the RTE.
todo: has to be tested again with latest version.
Link in RTE:
https://s3.amazonaws.com/academia.edu.documents/43377505/Music__Medicine__2014__Volume_6__Issue_220160304-7044-13ng0hf.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1535103367&Signature=bdOqyAP0p9T3La3X%2B2DymJgELPk%3D&response-content-disposition=inline%3B%20filename%3DMusic_and_Medicine_2014_Volume_6_Issue_2.pdf
Link in tx_brofix* Table:
https://s3.amazonaws.com/academia.edu.documents/43377505/Music__Medicine__2014__Volume_6__Issue_220160304-7044-13ng0hf.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1535103367&Signature=bdOqyAP0p9T3La3X%2B2DymJgELPk%3D&response-content-disposition=inline%3B%20filename%3DMusic_and_Medicine_2014_Volume_6_Issue_2.pdf
Snippet in RTE:
<p>30) Kreutz, G. (2014). Does singing facilitate social bonding? <em><a href="https://s3.amazonaws.com/academia.edu.documents/43377505/Music__Medicine__2014__Volume_6__Issue_220160304-7044-13ng0hf.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1535103367&Signature=bdOqyAP0p9T3La3X%2B2DymJgELPk%3D&response-content-disposition=inline%3B%20filename%3DMusic_and_Medicine_2014_Volume_6_Issue_2.pdf">Music and Medicine</a>, 6</em>(2), 51-60.</p>
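The two URLs above look identical in this listing, so the actual difference is not visible here. One possible cause (an assumption, not confirmed by the report) is HTML entity encoding, e.g. `&amp;` in the RTE markup versus `&` in the stored URL. A minimal sketch of a comparison that ignores this difference, with a hypothetical function name:

```python
import html


def same_target(url_in_rte: str, url_in_table: str) -> bool:
    """Compare two URLs after decoding HTML entities on both sides."""
    # "&amp;" in RTE markup and "&" in the database then compare as equal.
    # Note: html.unescape also decodes other entities it recognizes.
    return html.unescape(url_in_rte) == html.unescape(url_in_table)
```

Applying the same normalization to both values keeps the comparison symmetric, whichever side holds the encoded form.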
Describe the bug
Depth=0 is interpreted as an unset option and the default is used.
To Reproduce
vendor/bin/typo3 brofix:checklinks -p 1 -d 0 --dry-run
The default depth (999) or the depth set via Page TSconfig is used. This is not correct.
Expected behavior
A depth of 0 should be used.
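A common cause of this kind of bug is a truthiness check on the option value, which cannot tell 0 from "not given". Whether this is the actual cause in brofix is an assumption; the sketch below only illustrates the pattern. The function names are hypothetical, and 999 is the default depth mentioned above.

```python
DEFAULT_DEPTH = 999  # default depth mentioned in the report


def resolve_depth_buggy(cli_depth, tsconfig_depth=None):
    # Truthiness check: 0 is falsy, so "-d 0" is treated like "not given".
    return cli_depth or tsconfig_depth or DEFAULT_DEPTH


def resolve_depth(cli_depth, tsconfig_depth=None):
    # Fall back only when the option is really absent (None).
    if cli_depth is not None:
        return cli_depth
    if tsconfig_depth is not None:
        return tsconfig_depth
    return DEFAULT_DEPTH
```

With the explicit `None` check, a depth of 0 on the command line wins over both the Page TSconfig value and the default.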
System (please complete the following information):
To make editors more aware of the problems, we want to send the links list with the email report so they can judge if very important pages are in trouble. It will be an option (default OFF).
We will provide the pull request for these 2 features.
Describe the bug
Uncaught TYPO3 Exception Call to undefined method TYPO3\CMS\Core\Mail\FluidEmail::setReplyTo()
thrown in file /var/www/mysite/public/typo3conf/ext/brofix/Classes/Mail/GenerateCheckResultFluidMail.php
in line 73
To Reproduce
System (please complete the following information):
An HTTP error status code or a timeout may be a temporary error.
This might cause the check result to change from time to time.
You could argue that the target is at least flaky and should be checked manually, so the current behaviour is actually ok.
Related:
Followup issue for #94 and #106: Add filters to list of broken links.
Some additional changes:
For listing the broken links, it is checked whether the editor has write access to the original record that contains the broken link. If the editor does not have write access, the broken link is not displayed.
Check if this check is also performed for showing number of broken links in the page module!
Before saving a tt_content or a page, we want to add a button next to Close, Save, View, Delete. That way at the initial creation of the content, editors can make sure they don't introduce 404 and broken links.
We thought about making it a "Check link and save" but felt it would be cumbersome and not every save action needs to check for broken links.
We will provide a Pull request for this feature.
Is your feature request related to a problem? Please describe.
The current broken link report is good for sending to admins, but possibly not so good for editors, because:
While this report may be helpful for editors, a specific report may be more helpful
Describe the solution you'd like
Editor would have these settings:
Is your feature request related to a problem? Please describe.
There was context sensitive help in linkvalidator, but it is not currently used.
Describe the solution you'd like
TYPO3 10.4.15
When brofix is installed, clearing the caches fails.
Thanks for the helpful extension.
Is fixed in linkvalidator, see https://forge.typo3.org/issues/94381
Describe the bug
For some tables, broken links will not be displayed for editors:
or
To Reproduce
Steps to reproduce the behavior:
Expected behavior
broken links are displayed for admins and non-admins
When editing an element and invoking a recheck, the broken link will be removed. However, if a broken link is removed, it should possibly automatically be checked if other broken links with the same target should get removed as well.
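The idea can be sketched as follows: once a recheck shows that a target is no longer broken, all entries pointing to the same target are dropped, not only the one belonging to the edited element. The record structure and the function name are hypothetical and only serve as illustration.

```python
def remove_fixed_target(broken_links, fixed_url):
    """Return the broken-link entries whose target differs from fixed_url."""
    return [entry for entry in broken_links if entry["url"] != fixed_url]
```

This only fits the case where the target itself turned out to work again. If the link was merely removed from one element, the other entries still need their own recheck.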
Limit number of records displayed in linkvalidator report or add pagination
In general, content that will be used and is editable in backend should get checked - other content should not:
Find a general way to check this: to determine whether content is editable, a FormEngine data group (FormDataGroupInterface) can be used.
cms tools
lowlevel libraries:
command line tools
Online tools:
SEO tools
Some settings should be set once only for the entire installation.
Describe the bug
<a href="https://www.jobb%C3%B6rse.de/">www.jobbörse.de</a>
This link is reported as broken. However, if the link is followed in the frontend, it works.
(the URL resolves to https://www.xn--jobbrse-d1a.de/)
To Reproduce
Expected behavior
Should not be reported as broken.
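One way to handle such a URL is to convert the host name to its ASCII (Punycode) form before the request is made. The sketch below is an illustration under the assumption that the percent-encoded, non-ASCII host name is what makes the check fail; the function name is hypothetical. It uses Python's built-in `idna` codec.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return the URL with its host name converted to ASCII (Punycode)."""
    parts = urlsplit(url)
    # "www.jobb%c3%b6rse.de" -> "www.jobbörse.de"
    host = unquote(parts.hostname or "")
    ascii_host = host.encode("idna").decode("ascii")
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For the link above, this yields the same host name the browser resolves to, `www.xn--jobbrse-d1a.de`. User info in the URL (`user:password@`) is not preserved by this sketch.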
System (please complete the following information):
When a report is lengthy, we may need to filter results and sort the column by date of checking.
We will provide a pull request for these features.
Filters to add to the Report screen:
Also, being able to sort by the date of the check (column Checked(url)) can be useful.
Is your feature request related to a problem? Please describe.
Looks really bad on smartphones or small screens.
Describe the solution you'd like
Show a minimal table on small screens. Can still be used with only minimal information, e.g.
| Name of element |
|---|
| action icons |
Check if same change must be made in LinkAnalyzer as in core EXT:linkvalidator:
One might also think about making this a warning (orange) instead of an error (red), as for the "mixed" translations.
However, there are 2 reasons why it is recommended to have this (can be pointed out in the documentation):
The format of the from email differs between 9 and 10. In 10, it is possible to use something like this: Name <email@from>, while for 9 it should be only the email address.
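The difference between the two formats can be illustrated by splitting a "Name <email>" value into its parts, so that the bare address is available where only an address is accepted. This is a language-neutral sketch in Python using the standard library's `email.utils.parseaddr`; it does not show TYPO3 API, and the function name is hypothetical.

```python
from email.utils import parseaddr


def split_from(value: str):
    """Split 'Name <email@from>' into (name, address)."""
    name, address = parseaddr(value)
    return name, address
```

A bare address yields an empty name, so both configuration styles can be normalized to the same pair.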
The configuration must be reworked and tests added.
There are these options to check, currently:
Current scenario (brofix):
Current scenario (linkvalidator)
First 2 options as in brofix, additionally:
brofix:
linkvalidator
Please be sure to read the following, before you contribute:
We follow TYPO3 core conventions and principles as much as possible. That goes for the coding guidelines / code style, tests as well as best practices. Please familiarize yourself with the conventions for the TYPO3 core versions currently supported in the current main branch.
The GitHub tests should run automatically. You can check that the testsuite runs successfully, in the "Actions" tab of your fork of "brofix" before you create a pull request (PR).
It is also possible to run the tests locally using the runTests.sh scripts.
If you would like to make bigger changes or add new features, it is recommended to create an issue first and wait for it to be accepted. Also check the roadmap in #132 to see whether feature requests are currently welcome.
Currently, composer install is run too often. It should be possible to reuse the result. It must be run only once per combination of PHP version and composerInstallMin | composerInstallMax.
Several tests could run sequentially, but then we cannot let tests run in parallel. If possible, reuse artefacts.
Some configuration is required and should be set. Currently, the user will get no feedback, or will not get feedback until something fails.
See
In order to do the external link checking, brofix will make HTTP(s) requests to external sites.
In general, the goal is to get recognized as "good bot" and not blocked as "bad bot". To achieve that, we want to use good defaults for the User-agent HTTP header and crawl delay and give recommendations in the documentation. We should not bombard sites rapidly with requests. Some sites may not be able to handle this well.
At the moment we are being pretty nice by having a crawl delay, caching results of external links, and recommending to configure the User-Agent header with contact information. Also, if set up correctly, many concurrent requests to a site should (usually) not occur (because the external link cache is used).
We are however not using the robots.txt currently.
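The combination of a crawl delay and an external link cache described above can be sketched as follows. This is a simplified illustration, not the brofix implementation: the class name, the default values (5 seconds delay, 1 hour cache lifetime) and the injected `fetch` callable are assumptions.

```python
import time
from urllib.parse import urlsplit


class PoliteChecker:
    """Check URLs with a per-host crawl delay and a result cache."""

    def __init__(self, fetch, crawl_delay=5.0, cache_ttl=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.fetch = fetch              # callable: url -> status code
        self.crawl_delay = crawl_delay  # seconds between requests to one host
        self.cache_ttl = cache_ttl      # seconds a cached result stays valid
        self.clock = clock
        self.sleep = sleep
        self.cache = {}                 # url -> (timestamp, result)
        self.last_request = {}          # host -> timestamp of last request

    def check(self, url):
        now = self.clock()
        cached = self.cache.get(url)
        if cached is not None and now - cached[0] < self.cache_ttl:
            return cached[1]            # cache hit: no request is sent
        host = urlsplit(url).hostname
        last = self.last_request.get(host)
        if last is not None:
            wait = self.crawl_delay - (now - last)
            if wait > 0:
                self.sleep(wait)        # respect the crawl delay for this host
        result = self.fetch(url)
        finished = self.clock()
        self.last_request[host] = finished
        self.cache[url] = (finished, result)
        return result
```

Because the same URL is served from the cache and different URLs on one host are spaced by the delay, a site is not hit rapidly even when many records link to it.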
a bot developer should make its bot declare its identity in the user-agent HTTP header when communicating with a site
https://www.perimeterx.com/resources/blog/2017/live-by-code-of-good-bots/
bot developers provide a link/URL in the user-agent header to a page describing the bot, what it's doing, why a site owner should grant it access, and methods a site owner can use to control the bot
https://www.perimeterx.com/resources/blog/2017/live-by-code-of-good-bots/
Good bot builders should provide a defensible method to verify a bot is what it declares itself to be
(e.g. is other bot claiming to be our bot, what can site owners do to verify it is our bot)
We recommend that bot makers specify the verification method in the URL provided in the user-agent string.
https://www.perimeterx.com/resources/blog/2017/live-by-code-of-good-bots/
follow robots.txt: respect crawl-delay in robots.txt
https://www.darkreading.com/cloud/how-to-live-by-the-code-of-good-bots/a/d-id/1329979
dynamically adapt crawl speed / crawl delay depending on site
keep user agent string simple, no special characters, no encoded characters
BLEXbot is a very site-friendly crawler. We made it as "gentle" as possible when crawling sites: it makes only 1 request per 3 seconds, or even less frequently, if another crawl delay is specified in your robots.txt file. BLEXbot respects rules you specify in your robots.txt file. ...
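Respecting robots.txt, including its crawl delay, could look roughly like the sketch below. It is written in Python with the standard library's `urllib.robotparser` purely as an illustration; brofix currently does not do this. The function name, the user agent string and the fallback delay of 5 seconds are assumptions.

```python
from urllib.robotparser import RobotFileParser


def robots_policy(robots_txt: str, user_agent: str, url: str,
                  default_delay: float = 5.0):
    """Return (allowed, delay_in_seconds) for url according to robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None if no Crawl-delay applies
    allowed = parser.can_fetch(user_agent, url)
    return allowed, (default_delay if delay is None else float(delay))
```

The content of robots.txt is passed in as a string here, so fetching and caching that file is left to the caller.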
So far, the following reasons for false positives could be verified:
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html
* SSLLabs shows "chain issues: incomplete" and "extra download"
* to fix on server side: put the complete certificate chain in the certificate (including intermediate certificates)
* to fix on client side (the server where brofix is running): download intermediate certificates
some URLs are reported as errors even though they work (in browser)
The 999 HTTP error is a LinkedIn error. It happens when LinkedIn blocks the User-Agent that tries to access a link. I’m afraid this is an issue on LinkedIn's side: it treats your site as a fake User-Agent.
Since the link is not broken, please feel free to set it as “Not Broken” link.
other
Apart from this, all URLs returning 401 or 403 (access-restricted URLs) will fail. In that case, it is not really an error, but expected. These cases could either be added as exclude link target entries, or we could make external link type errors configurable (e.g. have an exclude list for that as well, where you could exclude for example 401, 403, maybe also "too many redirects").
see also: https://notes.typo3.org/linkvalidator_problem_external_urls
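A configurable exclude list for status codes, as suggested above, can be sketched like this. The names and the default set are hypothetical; the sketch only shows the classification step, not how brofix stores or reports results.

```python
EXPECTED_STATUS = frozenset({401, 403})  # hypothetical default exclude list


def classify(status: int, excluded=EXPECTED_STATUS) -> str:
    """Classify an HTTP status code as 'ok', 'ignored' or 'broken'."""
    if 200 <= status < 400:
        return "ok"
    if status in excluded:
        # Expected for access-restricted URLs: do not report as broken.
        return "ignored"
    return "broken"
```

Site-specific codes such as LinkedIn's 999 could be added to the list in the same way, e.g. `classify(999, excluded={401, 403, 999})`.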
Related:
necessary steps:
Caveats: extracting links and determining whether a link is an email link may not be easy (I already looked into this). Check first whether this is possible, see LinkAnalyzer::findLinksForRecords
Describe the bug
In "connected mode" the translated element effectively inherits the "hidden" state of the original. If this is hidden, the CE should not be displayed in FE.
To Reproduce
brofix.checkhidden=0
Expected behavior
Because checkhidden=0, hidden CEs should not get checked. The translated content, even though not displayed as hidden in the backend, is effectively hidden: it inherits the status from the CE of the original language and will not be displayed in the FE.
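The inherited "hidden" state can be expressed as a small rule: a translated element counts as hidden if it is hidden itself or if its original-language element is hidden. The sketch below is a simplified illustration using the tt_content fields `hidden` and `l18n_parent`; the function names and the dictionary-based records are hypothetical, and other visibility conditions (start/end time, hidden pages) are left out.

```python
def effectively_hidden(record, records_by_uid):
    """Return True if the record or its original-language parent is hidden."""
    if record["hidden"]:
        return True
    parent_uid = record.get("l18n_parent", 0)
    if parent_uid:
        # Connected mode: the translation inherits the parent's hidden state.
        parent = records_by_uid.get(parent_uid)
        return bool(parent and parent["hidden"])
    return False


def should_check(record, records_by_uid, check_hidden=False):
    """Decide whether links in this record should be checked."""
    return check_hidden or not effectively_hidden(record, records_by_uid)
```

With `check_hidden=False`, a visible translation of a hidden original is skipped, which matches the expected behavior described above.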
Screenshots
System (please complete the following information):
Additional context