Code Monkey home page Code Monkey logo

brofix's People

Contributors

7elix avatar alexander-nitsche avatar alexanderschnitzler avatar andreaskienast avatar astehlik avatar atigiti avatar bmack avatar georgringer avatar helhum avatar ianouf avatar ichhabrecht avatar liayn avatar lolli42 avatar maddy2101 avatar michadu avatar neoblack avatar neufeind avatar ohader avatar pgampe avatar pyguy2 avatar sbuerk avatar sgrossberndt avatar susannemoog avatar sypets avatar thommyhh avatar tmaroschik avatar tmotyl avatar tolleiv avatar wouter90 avatar xperseguers avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

brofix's Issues

Add new tab "Manage exclusions"

We can already manage the "exclusions" records stored in tx_brofix_exclude_link_target with the list module but it's not optimal. We want to add a new tab with those records with few functionnalities:

  • New tab "Manage exclusions"
  • List the exclusions based on "excludeLinkTarget.storagePid" (so the list isn't global for huge pagetrees)
  • Each line starts with a checkbox then the fileds: Linktarge(clickable), reason and hidden (with sorting on each columns)
  • Before the list: Button for CSV export "Export exclusion list" (includes all columns of the table, not just those showned in the interface.
  • Button "Check all records"
  • Button "Delete all checked exclusions" (if all in a way it's a "reset" for the report)

We will provide a Pull Request for these new functions.

Fix EditableRestriction for allowed_languages

Describe the bug

If no allowed languages are defined for a BE user / group ([allowed_languages] is not set), only broken links in records of default language are shown for editors.

To Reproduce

Steps to reproduce the behavior:

  1. Create broken links in content element of not default language (sys_language_uid != 0)
  2. Create BE user with no restrictions to languages they can edit (leave checkboxes for "allowed_languages" empty).
  3. Check for broken links
  4. Compare broken links displayed for the user with broken links displayed for admin user

Only broken links in elements of default language are displayed for non-admin users.

Expected behavior

The editor should also see broken links in the translated elements.

System (please complete the following information):

  • TYPO3 version: [e.g. 10.4.x]
  • brofix version master

Basic principles and goals

  1. Stable and robust (fix bugs before features)
  2. Configurable and extendable
  3. (Almost) zero configuration quickstart
  4. clean code and test suite
  5. User experience, fun to use

Explanations

about 1: Stable and robust (fix bugs before features)

Fix bugs before adding bells and whistles

about 2: Configurable and extendable

The extension should be configurable and extendable, but use defaults that will work well in most usecases.

about 3: (Almost) zero configuration quickstart

It should be possible to start without having to configure and setup a lot. Unfortunately, this is not entirely possible, but we try to use good defaults wherever possible, use the settings from global configuration (where available) and generate defaults.

about 4: Clean code and test suite

We adhere to TYPO3 core conventions as much as possible. This means we also enforce the TYPO3 coding guidelines, for example. Tools exist to check locally. For PR, GitHub actions run and perform the checks.

See CONTRIBUTING.md in this repo for more information, also see commands in composer.json and .github/workflows/ci.yaml.

For new features or bugfixes, tests should be added.

about 5: User experience, fun to use

It should be fun to use or at least a good experience. This means - wherever possible:

  • intuitive to use.
  • useful documentation
  • nice graphics
  • visualize results and show progress (what is happening?)
  • feedback of achievements
  • gamification?

Optimize table of broken links for usability UX

Is your feature request related to a problem? Please describe.

  • table looks cluttered - information overload
  • duplicate information, e.g. if several broken links for the same page / record, the left columns "page", "element", "type" and "checked" all contain the same information
  • back and forth switching between list and edit form feels weird - user looses track
  • DONE: not enough possibility to filter (e.g. filter by URL)

Describe the solution you'd like

More control for the user:

  • DONE: filter (no 1, see link)
  • (optional) user can click away columns (or choose a preset)
  • (optional) change order or columns (no 3, see link)
  • make columns resizable (no 2, see link)
  • inline editing / edit in side view (no 4)
  • fixed header when scrolling
  • only show action icons for selected row (no 19)
  • ...

Also:

  • DONE: paginate
  • find solution for duplicate entries for columns in consecutive rows, example:

Resources

Good resource: https://www.mobilespoon.net/2019/11/design-ui-tables-20-rules-guide.html

Example screenshots

image 1: duplicate information in left columns if more than one broken link in record / on page

image

GUI: List of broken links: should the buttons be in one "Actions" columns or next to the item they belong to?

There are now several action buttons. Where should they be placed?

  1. Context specific: e.g. put the button for the "page layout" in "page" column, button for "edit" in "element column etc.
  2. In one column "Actions"

On the one hand, the context specific looks visually distracting. The buttons are visually visisble as buttons but there is too much text and it takes some getting used to what is just plain information (e.g. language icon, content type icon) and what is an action. So it would be better to group this in an action column. This also corresponds to other views in the core, e.g. the redirects module:

image

On the other hand, there is a distinction between 2 parts of the link: the left part corresponds to where the link is (the content element on the page), the right part corresponds to the link target (the URL). It is clearer to put the action buttons where they belong.

image

Add dashbord widget?

what can be displayed?

  • number of broken links: total
  • number of broken links on content editable by user
  • next goal

Inspiration

I very much like the German vacination dashboard: https://impfdashboard.de/

Some principles / ideas are explained on Twitter by Moritz Stefaner, e.g.:

image

image

Information - how to implement

  1. make dashboard widget optional - if ext:dashboard is installed. see

Tests run twice if feature branch created in this repo (not in fork)

Describe the bug

Before the CI was run twice if a pull request is created from a feature branch within the same repository (not a fork). The changes follow the recommendation given by @isaac in When Should CI Run? chapter of his blog post: CI with GitHub Actions for Ember Apps

as in

  • https://github.com/jelhan/create-github-actions-setup-for-ember-addon/pull/7
  • https://github.com/jelhan/create-github-actions-setup-for-ember-addon/issues/5

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior

Run tests only once

Additional context

Could also be resolved by creating a fork and pushing directly to master

File link not marked in RTE as broken

URL in tx_brofix* Table is not the same URL in RTE with the result that the link is not marked as broken in RTE.

todo: has to be tested again with latest version.

Link in RTE:
https://s3.amazonaws.com/academia.edu.documents/43377505/Music__Medicine__2014__Volume_6__Issue_220160304-7044-13ng0hf.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1535103367&Signature=bdOqyAP0p9T3La3X%2B2DymJgELPk%3D&response-content-disposition=inline%3B%20filename%3DMusic_and_Medicine_2014_Volume_6_Issue_2.pdf

Link in tx_brofix* Table:
https://s3.amazonaws.com/academia.edu.documents/43377505/Music__Medicine__2014__Volume_6__Issue_220160304-7044-13ng0hf.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1535103367&Signature=bdOqyAP0p9T3La3X%2B2DymJgELPk%3D&response-content-disposition=inline%3B%20filename%3DMusic_and_Medicine_2014_Volume_6_Issue_2.pdf

Snippet in RTE:
<p>30)&nbsp;Kreutz, G.&nbsp;(2014). Does singing facilitate social bonding?&nbsp;<em><a href="https://s3.amazonaws.com/academia.edu.documents/43377505/Music__Medicine__2014__Volume_6__Issue_220160304-7044-13ng0hf.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&amp;Expires=1535103367&amp;Signature=bdOqyAP0p9T3La3X%2B2DymJgELPk%3D&amp;response-content-disposition=inline%3B%20filename%3DMusic_and_Medicine_2014_Volume_6_Issue_2.pdf">Music and Medicine</a>, 6</em>(2), 51-60.</p>

Clean up GUI

  • Some texts are too long (e.g. complete root line)

[BUG] Depth will not be set correctly via CLI if depth=0 is set

Describe the bug

Depth=0 is interpreted as unset option and the default is used.

To Reproduce

  1. Call CLI command with depth=0

vendor/bin/typo3 brofix:checklinks -p 1 -d 0 --dry-run

The default depth (999) or the depth set via Page TSconfig is used. This is not correct.

Expected behavior

A depth of 0 should be used.

System (please complete the following information):

  • TYPO3 version: 10.4.17
  • brofix version: 2.1.1-dev
  • installed with Composer: no

Add new emails option : links list

To make editors more aware of the problems, we want to send the links list with the email report so they can judge if very important pages are in trouble. It will be an option (default OFF).

  • Add new TSconfig value: mod.brofix.email.addLinks = 0 (default)
  • In the email list URL or uid of the page with the broken link and the link being in trouble
  • Make sure using the CLI to send email reports also have the option working
  • After the statistics and before the links, add a link to the BE so editors can just click on it. If v11, make it a link to the Info Module with Brofix already selected.

We will provide the Pull request for theses 2 features.

Exception on link checking with replyto email set

Describe the bug

Uncaught TYPO3 Exception Call to undefined method TYPO3\CMS\Core\Mail\FluidEmail::setReplyTo()
thrown in file /var/www/mysite/public/typo3conf/ext/brofix/Classes/Mail/GenerateCheckResultFluidMail.php
in line 73

To Reproduce

  1. set replyto email in TSconfig
  2. start link checking via console command

System (please complete the following information):

  • TYPO3 version: 10.4.17
  • brofix version: 2.1.0 (latest)
  • installed with Composer: yes

Further enhancements for filter (in list of broken links)

Followup issue for #94 and #106: Add filters to list of broken links.

Some additional changes:

  • add label to input fields
  • shorten length of UID input field
  • make it possible to choose wildcard match / exact match / regex (at least for the URL)

image

  • make it possible to fill the filter by clicking, e.g on the id, the title, the URL

image
image

  • additional field to choose link type (e.g. page link, external)
  • additional field to filter error message
  • compare with current core version for styling and accessibility (e.g. redirects module)

Check if records are editable is not performed in hook for page module

For listing the broken links, it is checked if the editor has write access to the original record where the broken link existed. If he does not have write access, the broken link is not displayed.

Check if this check is also performed for showing number of broken links in the page module!

Add a "Check links" button in Docheader for Pages and tt_content

Before saving a tt_content or a page, we want to add a button next to Close, Save, View, Delete. That way at the initial creation of the content, editors can make sure they don't introduce 404 and broken links.

We thought about making it a "Check link and save" but felt it would be cumbersome and not every save action needs to check for broken links.

We will provide a Pull request for this feature.

Broken link report for editors

Is your feature request related to a problem? Please describe.

The current broken link reports is good for sending to admins, possibly not so good for editors, because:

  • it considers the entire site, not just the content the editor has access to

While this report may be helpful for editors, a specific report may be more helpful

Describe the solution you'd like

  • create a report for editors. This can be a separate CLI command
  • an indiviudual report is sent to each editor
  • no need to configure the list of email recipients, the email in be_users is used
  • editors can activate / deactivate this in their user settings
  • A similar report is sent as the (current) one for admins, but only the content the editor has access to is considered
  • Also send a list of broken links as suggested by @gaumondp in #96 (possibly reuse that part, once it is merged)
  • (optional) send only if new broken links are detected (should also be configurable in user settings
  • (optional) send only list of new broken links (or sort list by crdate)? (configurable)

Editor would have these settings:

  • Receive regular broken link report via email: yes / no
  • Receive broken link report only if new broken links are detected: yes/ no

Utilize context sensitive help

Is your feature request related to a problem? Please describe.

There was context sensitive help in linkvalidator, but it is not currently used.

Describe the solution you'd like

  • checkout context sensitive help in general, e.g. look at BackendUtility::wrapInHelp
  • possibly reuse existing Resources/Private/Language/Module/locallang_csh.xlf

Clearing caches fails

TYPO3 10.4.15
When brofix is installed the clearing caches fails.


Thanks for the helpful extension.

Fix EditableRestriction - allow / deny types is applied incorrectly

Is fixed in linkvalidator, see https://forge.typo3.org/issues/94381


Describe the bug

For some tables, broken links will not be displayed for editors:

  • table with a ctrl => type field
  • field does not have "authMode" or "authMode_enforce" === "strict"

or

  • type field contains "0"

To Reproduce

Steps to reproduce the behavior:

  1. install news
  2. configure tx_news_domain_model_news records to get checked (e.g. "bodytext")
  3. compare the list of broken links for admins and non-admins

Expected behavior

broken links are displayed for admins and non-admins

Check only fields that are editable and actually being used

In general, content that will be used and is editable in backend should get checked - other content should not:

  • Do not check fields based on type (e.g. doktype,ctype), e.g. do not check bodytext on pages with type 'list'
  • do not check content elements on pages with doktype 3 or 4 (will not be displayed in frontent, but will be displayed in backend)
  • do not check default language if "Hide default language of page" is selected
  • do not check content elements in hidden gridelements
  • etc.

Find general way to check: for checking if editable FormEngine data group (FormDataGroupInterface) can be used.

checkout how other tools do link checking

cms tools

lowlevel libraries:

command line tools

  • curl (TYPO3 / Guzzle uses curl under the hood)

Online tools:

SEO tools

Urlencoded link with umlauts is falsely reported as broken

Describe the bug

<a href="https://www.jobb%C3%B6rse.de/">www.jobbörse.de</a>

This link is reported as broken. However, if the link is followed in the frontend, it works.

(the URL resolves to https://www.xn--jobbrse-d1a.de/)

To Reproduce

  1. Create a link with the target https://www.jobb%C3%B6rse.de/
  2. Do link checking: this link will be reported as broken

Expected behavior

Should not be reported as broken.

System (please complete the following information):

  • TYPO3 version: 10.4.17
  • brofix version: latest master
  • installed with Composer: no

Add filters and sorting on date of check in Report

When a report is lengthy, we may need to filter results and sort the column by date of checking.

We will provide a pull request for theses features.

Filters to add to the Report screen:

  • Title
  • Uid
  • Source(url)

Also, having a sorting capacity on the date of the check (column Checked(url)) can be useful.

  • Add sorting to column Checked(url)

Show minimal table on small screens

Is your feature request related to a problem? Please describe.

Looks really bad on smartphones or small screens.

image

Describe the solution you'd like

Show a minimal table on small screens. Can still be used with only minimal information, e.g.

Name of element
action icons
Name of element
action icons

Resources

Make notice in page module configurable

  • page_callouts should be not a hard requirement but rather a "suggest"
  • even if page_callouts installed, should be made configurable

One might also think about not making it not an error (red), but a warning (orange), like for the "mixed" translations.

image


However, there are 2 reasons, it is recommended to have this (can be pointed out in documentation):

  • it should guide editors to broken link list and not fix broken links in page module: this way, if records are edited by clicking the edit pencil in broken link list, the links are checked again automatically when returning to list. (If you edit via the page module, this does currently not happen).
  • when you click on second action button (page layout)
    image
    there is currently no way to go back to list directly
    image
    (not possible in core), so this is helpful to jump back to link list.

[BUG] Problem with email format

The format of the from email differs between 9 and 10. In 10, it is possible to use something like this: Name <email@from>, while for 9 it should be only the email address.

The configuration must be reworked and tests added.

Improve when and how links are checked

main goals

  1. Freshness of broken links results (or at least visual feedback about how fresh the result is).
  2. Avoid excessive link checking

current scenario

There are these options to check, currently:

Current scenario (brofix):

  1. check in console command (regular full check)
  2. check after editing a record from the broken link list
  3. (only in brofix): have a "recheck" button for every broken link. This also overrides the link target cache.

Current scenario (linkvalidator)

First 2 options as in brofix, additionally:

  1. (only in linkvalidator): "Check link" tab

Current problems

brofix:

  • if a record is edited (not from the broken link list), the broken links information is not updated
    • exception: if a record is hidden or deleted, broken link records are immediately removed (in brofix, not linkvalidator)

linkvalidator

  • with the checklinks tab, you can excessively check links (including with level) infinite and unnecessarily check links over and over again

The real problems

  • Currently, there is no feedback, whether list of broken links is up to date
  • We want up-to-date "fresh" broken links information, but we do not want to excessively check for broken links as it puts load on the site

Contributions are welcome!

Please be sure to read the following, before you contribute:

We follow TYPO3 core conventions and principles as much as possible. That goes for the coding guidelines / code style, tests as well as best practices. Please familiarize yourself with conventions for the TYPO3 core versions currently supported in current main branch.

The GitHub tests should run automatically. You can check that the testsuite runs successfully, in the "Actions" tab of your fork of "brofix" before you create a pull request (PR).

It is also possible to run the tests locally using the runTests.sh scripts.

If you would like to make bigger changes or add new features, it is recommended to create an issue first and wait for it to be accepted. Also check out the roadmap in #132 if currently, feature request are welcome.

Optimize CLI runs (reuse results)

Currently composer install is run too often. Should be able to reuse. Must be run only once per combination php version and composerInstallMin | composerInstallMax.

Several tests could run sequentially, but then we cannot let tests run in parallel. If possible, reuse artefacts.

Add sanity check for configuration

There are some required configuration, which should be set. Currently, the user will get no feedback or not get feedback until something fails.

  • mail recipients and fromemail should be set (if mail is configured to be sent
  • User agent should be filled out (but will be autogenerated if not set)
  • storagePid should be set to != 0 if editors should be able to create exclude entries

See

Revisit recommendations for good bots

In order to do the external link checking, brofix will make HTTP(s) requests to external sites.

In general, the goal is to get recognized as "good bot" and not blocked as "bad bot". To achieve that, we want to use good defaults for the User-agent HTTP header and crawl delay and give recommendations in the documentation. We should not bombard sites rapidly with requests. Some sites may not be able to handle this well.

At the moment we are being pretty nice by having a crawl delay, caching results of external links, and recommending to configure the User-Agent header with contact information. Also - if setup correctly - many concurrent requests to a site should (usually) not occur (because the external link cache is used).

We are however not using the robots.txt currently.

Relevant settings

  • User-Agent
  • concurrent requests and wait time between requests (currently, the min wait time between 2 requests from same site is 5 seconds but this is configurable)

Recommendations

a bot developer should make its bot declare its identity in the user-agent HTTP header when communicating with a site

https://www.perimeterx.com/resources/blog/2017/live-by-code-of-good-bots/

bot developers provide a link URL in the user-agent header to a page describing the bot, what it's doing, why a site owner should grant it access, and methods a site owner can use to control the bot

https://www.perimeterx.com/resources/blog/2017/live-by-code-of-good-bots/

Good bot builders should provide a defensible method to verify a bot is what it declares itself to be
(e.g. is other bot claiming to be our bot, what can site owners do to verify it is our bot)
We recommend that bot makers specify the verification method in the URL provided in the user-agent string.

https://www.perimeterx.com/resources/blog/2017/live-by-code-of-good-bots/

follow robots.txt: respect crawl-delay in robots.txt

https://www.darkreading.com/cloud/how-to-live-by-the-code-of-good-bots/a/d-id/1329979

dynamically adapt crawl speed / crawl delay depending on site

  • e.g. fast request: less wait time, slow response - long wait time

keep user agent string simple, no special characters, no encoded characters

Resources

BLEXbot is a very site-friendly crawler. We made it as "gentle" as possible when crawling sites: it makes only 1 request per 3 seconds, or even less frequently, if another crawl delay is specified in your robots.txt file. BLEXbot respects rules you specify in your robots.txt file. ...

Examples

Fix problem with false positives (external URLs)

todos

  • in brofix: provide information in documentation, but it is not brofix's job to download intermediate certificates
  • make it possible to exclude specific error types (e.g. curl(60), but this usually has problem, that legitimate errors are not displayed as well
  • use some heuristic for better detection of probable false positives (difficult, because has several reasons and several error types can be affected)

summary

So far, the following reasons for false positives could be verified:

  1. certificate chain issue (this is actually an error on the server side of the webserver which is checked, but it is a minor error and page can be loaded without warning in browser, so this is perceived (!) as not broken by user (this should be distinguished from other TLS security isssues, such as outdated certificate etc.)
    • error is usually curl 60 (can be verified by using curl -I "url" on server

curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

* SSLLabs shows "chain issues: incomplete" and "extra download"
* to fix on server side: put complete certificate chain in certifcate (including intermediate certificates)
* to fix on client server side (where brofix is running): download intermediate certificates
  1. cloudflare
    • error 503 is given

problem description

some URLs are reported as errors even though they work (in browser)

Examples:

The 999 HTTP error is a Linkedin error. It happens when Linkedin blocks the User-Agent that tries to access a link. I’m afraid it is an issue from the Linkedin, is streets your site as fake User-Agent.
Since the link is not broken, please feel free to set it as “Not Broken” link.


  • twitter URLs

other

  • ...

Apart from this, all 401, 403 (access restricted URLs) will fail. In that case, it is not really an error, but expected. For these cases, they could either be added as exclude link target entry, or we could make external link type errors configurable (e.g. have an exclude list for that as well, where you could exclude for example 401, 403, maybe also "too many redirects").


see also: https://notes.typo3.org/linkvalidator_problem_external_urls

Related:

Add check for email links (validate email address)

necessary steps:

Caveats: extracting and determining if a link is an email link may be not so easy (I already looked into this). Should check first if this is possible, see LinkAnalyzer::findLinksForRecords

Translated elements of hidden elements should not get checked (in connected mode)

Describe the bug

In "connected mode" the translated element effectively inherits the "hidden" state of the original. If this is hidden, the CE should not be displayed in FE.

To Reproduce

  1. Make sure brofix.checkhidden=0
  2. Create some content on a page with one broken link, translate it in connected mode ("Translate", not copy)
  3. Deactivate the CE of the original language
  4. check the links on the page with brofix
  5. Look at the report. You will now see the broken link for the content on the translated page

Expected behavior

Because checkhidden=0, hidden CE should not get checked. The translated content - even though not displayed as hidden in the backend - is effectively hidden - it inherits the status from the CE of the original language, it will not be displayed in the FE.

Screenshots

System (please complete the following information):

  • TYPO3 version: 10.4.15
  • brofix version: master
  • page_callouts version [e.g. 1.0.0]
  • installed with Composer: no

Additional context

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.