bookstorescraper's Introduction

👋 I'm Marina 😊

I'm a Software Developer, mainly working with C#/.NET and the Web. 👩‍💻✨ I also have an interest in Software Architecture and Design.

🎯 My Programming Timeline recaps my life and career as a programmer and professional software developer.

Here is my CV 😊

⭐ Project showcase

Here are some projects of mine that I would like to show:

  • YourBrand (2021 - 2023, 2024 -) - Enterprise system for e-commerce and consulting services. Distributed app with deployment to the cloud. Based on and incorporates the following projects:
    • YourBrand 3.0 (2023 - 2024) - E-commerce site/system for the cloud. Based on eShop.
    • eShop (2022 - 2023) - E-commerce site/system. Based on the YourBrand and "Todo app" projects.
    • Product Catalog (2021) - A product catalog and configurator.
    • Time Report (2021) - Project management app with time reporting functionality.
    • Finance app (2021 - 2022) - App for doing finance, in particular accounting. Some Invoicing.
    • Showroom (2018 - 2022) - Site helping a consultancy company present its consultants to customers.
  • Todo app (2022) - Reference project for Clean Architecture with focus on Use Cases.
  • Tigergenerator 2.0 (2020) - Web app for generating satirical images with the tiger from "En Svensk Tiger". Remake of Tigergenerator.
  • Point Of Sale (2021) - Electronic cash register app with product catalog and receipts.
  • Commuter (2019) - Conceptual commuter app listing stops and departures based on the user's position.
  • Audio Player app (2017) - Concept mobile audio player app for Axis audio products. C#/.NET, Xamarin.Forms
  • Access Control (2017 - 2018) - Physical Access Control system. C#/.NET, Raspberry Pi, Azure, Web app, Mobile Apps

Here is a 🔗 playlist showing some of my projects.

✔️ Job Interview Assignments

I have uploaded the results of some of the assignments that I have done for interviews.

  • Agent Recruiter (2020) - App for matching recruiters with secret agents. Tinder-like swipe interface. C#/.NET, Xamarin.Forms
  • FileViewer (2020) - Explorer-type web app visualizing and manipulating a virtual filesystem. C#/.NET & Blazor
    • FileViewer2 (2021) - Second attempt. Rewrite from scratch with MudBlazor component library.
  • RobotApp (2022) - Web app controlling a robot on screen by giving commands. HTML, JavaScript, and Canvas (C# version)
  • Snake (2022) - Implementation of the classic game Snake as a Web App, using HTML, JavaScript, and Canvas
  • BookStoreScraper (2023) - Scrapes the Books to Scrape site and downloads the entire site for offline viewing. C#/.NET
  • ChatApp (2023) - Chat app built with ASP.NET Core and Blazor.
  • FizzBuzz (2024) - Fizz buzz with unit tests

🏫 School projects

  • C Micro compiler (2008 - 2009) - My first ever attempt at building a compiler. C-like language.
  • VB Lite compiler (2011) - Compiler for a Visual Basic.NET-like language. Loosely based on the Mono C# compiler architecture.

Other compiler and parser projects

  • ExpressionEvaluator (2016) - Expression parser, evaluator, and compiler. Uses the operator-precedence parser algorithm and Reflection.Emit for code generation.
  • Compiler projects (2022 - 2023) - A couple of compiler projects for prototyping using a modern compiler architecture.

Experiments

  • BlazorMinimalApiTest (2023) - Experiment rendering interactive components from Minimal API endpoints in ASP.NET Core 8 RC. Not supported by default.
  • BlazorPhp (2023) - Experiment adding a Blazor component to a Peachpie PHP project. Based on ASP.NET Core 8 RC1. Uses reflection hacks.

Misc projects

  • MAUI Blazor hybrid app, with Fluent UI, and ASP.NET Core backend (2024)
  • Blazor reference app for .NET 8 with Bootstrap 5. Cloud-ready template project with documentation (2023).
  • rabbitmq-java-test (2023) - Exploring RabbitMQ in Java. With Docker containerization.
  • Blazor Basics (2022) - Samples demonstrating various concepts in Blazor, such as how component binding works and how to do JavaScript interop.
  • .NET IoT samples (2018) - A collection of samples intended to run on Raspberry Pi.
  • HttpListener library for .NET Core 1 (2016) - Abstraction on top of TCP Listener, built to fill the lack of such an API at the time.
  • x86-encoder (2013) - A library for generating X86 machine instructions.

▶️ Tech Talks

The full playlist can be found here.

🔗 Where to find me

bookstorescraper's People

Contributors

marinasundstrom

bookstorescraper's Issues

Refactoring

Refactoring is ongoing in the develop branch.

Directive

  • To facilitate unit testing.
  • To make the program more performant so that it completes faster.

To Do

  • Extracting processor methods into separate classes

    • Each having their own context
    • Allow for concurrent execution to minimize time
  • Write unit tests

Stretch

  • Investigate whether the readFiles HashSet or File.Exists performs better in terms of time and memory allocations.
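
For the stretch item, a rough first comparison could look like the sketch below. It assumes that readFiles is a HashSet<string> of file paths that have already been written; the temp file, the iteration count, and the Measure helper are made up for illustration. A Stopwatch loop like this only gives a ballpark figure, so a dedicated benchmarking tool would be the better choice before drawing conclusions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical setup: one temp file that counts as "already downloaded".
string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
File.WriteAllText(path, "placeholder");
var readFiles = new HashSet<string>(StringComparer.Ordinal) { path };

const int iterations = 100_000;

// Runs the check repeatedly and reports elapsed time, bytes allocated
// on the current thread, and the result of the last check.
static (TimeSpan Elapsed, long Bytes, bool Result) Measure(Func<bool> check, int count)
{
    bool result = false;
    long before = GC.GetAllocatedBytesForCurrentThread();
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < count; i++)
    {
        result = check();
    }
    stopwatch.Stop();
    long after = GC.GetAllocatedBytesForCurrentThread();
    return (stopwatch.Elapsed, after - before, result);
}

var viaSet = Measure(() => readFiles.Contains(path), iterations);
var viaDisk = Measure(() => File.Exists(path), iterations);

Console.WriteLine($"HashSet:     {viaSet.Elapsed.TotalMilliseconds:F1} ms, {viaSet.Bytes} bytes");
Console.WriteLine($"File.Exists: {viaDisk.Elapsed.TotalMilliseconds:F1} ms, {viaDisk.Bytes} bytes");

File.Delete(path);
```

Note that the two checks are not equivalent: the HashSet only knows about files written during the current run, while File.Exists also sees files left over from earlier runs.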

Path separators are platform specific

A user encountered this while running on Windows.

It seems that the Uri class cannot handle URIs containing backslashes (\), which Windows uses as its path separator.

The Path API, which is used for path manipulation, conforms to the host platform. So on Windows, it uses \.

This program was developed on a Mac, which uses / in filesystem paths.

Question: Is there a better method than using the platform-specific Path API?

info: BookScraper.NavigationManager[0]
      Navigated to: http://books.toscrape.com/index.html
info: BookScraper.Processors.DocumentProcessor[0]
      Downloading: http://books.toscrape.com/index.html
info: BookScraper.MyHttpClient[0]
      Downloading: http://books.toscrape.com/index.html
info: BookScraper.Processors.DocumentProcessor[0]
      Processing document
info: BookScraper.Processors.DocumentProcessor[0]
      Found script: http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js
info: BookScraper.Processors.DocumentProcessor[0]
      Found script: static/oscar/js/bootstrap3/bootstrap.min.js
Unhandled exception. System.UriFormatException: Invalid URI: The hostname could not be parsed.
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind, UriCreationOptions& creationOptions)
   at System.Uri..ctor(String uriString)
   at BookScraper.UrlHelpers.AsAbsoluteUrl(String baseUrl, String currentUrl, String relUrl) in C:\code\assignments\marinasundstrom\BookScraper\UrlHelpers.cs:line 15
   at BookScraper.Context.AsAbsoluteUrl(String relUrl) in C:\code\assignments\marinasundstrom\BookScraper\Context.cs:line 17
   at BookScraper.Processors.ScriptProcessor.ProcessScriptSrc(Context context, String scriptElementSrc, CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\Processors\ScriptProcessor.cs:line 26
   at BookScraper.Processors.DocumentProcessor.ProcessScripts(Context context, IDocument document, CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\Processors\DocumentProcessor.cs:line 124
   at BookScraper.Processors.DocumentProcessor.ProcessHtml(Context context, IDocument document, CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\Processors\DocumentProcessor.cs:line 102
   at BookScraper.Processors.DocumentProcessor.ScrapeDocument(Context context, CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\Processors\DocumentProcessor.cs:line 95
   at BookScraper.Processors.DocumentProcessor.ScrapeDocument(String url, CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\Processors\DocumentProcessor.cs:line 45
   at BookScraper.Scraper.Scrape(CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\Scraper.cs:line 64
   at BookScraper.ScraperHostedService.StartAsync(CancellationToken cancellationToken) in C:\code\assignments\marinasundstrom\BookScraper\ScraperHostedService.cs:line 15
   at Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Program.<Main>$(String[] args) in C:\code\assignments\marinasundstrom\BookScraper\Program.cs:line 26
   at Program.<Main>(String[] args)
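
One possible answer to the question above, sketched under assumptions: keep all URL handling inside the Uri class, which always uses /, and only switch to the Path API at the last step, when a URL is mapped to a file on disk. ResolveUrl and ToLocalPath are hypothetical names for illustration and are not the project's UrlHelpers.AsAbsoluteUrl; the page URL in the usage lines is also made up.

```csharp
using System;
using System.IO;

// Resolves a (possibly relative) URL against the page it was found on.
// Only Uri is involved, so the result does not depend on the host OS.
// If relUrl is already absolute, it is returned as-is.
static Uri ResolveUrl(string pageUrl, string relUrl) =>
    new Uri(new Uri(pageUrl), relUrl);

// Maps a URL onto a local file path below outputRoot.
// Path.Combine inserts the separator of the current platform.
static string ToLocalPath(string outputRoot, Uri url)
{
    string[] segments = Uri.UnescapeDataString(url.AbsolutePath)
        .Split('/', StringSplitOptions.RemoveEmptyEntries);
    return Path.Combine(outputRoot, Path.Combine(segments));
}

Uri script = ResolveUrl(
    "http://books.toscrape.com/catalogue/page-2.html",
    "../static/oscar/js/bootstrap3/bootstrap.min.js");

Console.WriteLine(script);
Console.WriteLine(ToLocalPath("Output", script));
```

With this split, a backslash can only appear in the final local path, never in a string that is passed to the Uri constructor.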

Completion message is not printed in console

Probable cause: Logger doesn't always get to flush before the program exits.

The logger works by letting another thread asynchronously write buffered messages to the console.
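
To illustrate the suspected cause, here is a minimal stand-in for such a logger, not the actual console logger implementation. Messages are queued and written by a background task, and the last message only reaches the output if the queue is drained before the program exits. The queue, the StringBuilder that stands in for the console, and the message texts are assumptions made for the sketch.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;
using System.Threading.Tasks;

var queue = new BlockingCollection<string>();
var output = new StringBuilder(); // stands in for the console

// Background writer: drains the queue until it is marked as complete.
Task writer = Task.Run(() =>
{
    foreach (string message in queue.GetConsumingEnumerable())
    {
        output.AppendLine(message);
    }
});

// Logging only enqueues; nothing is guaranteed to be written yet.
queue.Add("Scraping started");
queue.Add("Scraping completed");

// The "flush": stop accepting messages and wait for the writer to finish.
// Exiting without these two lines could lose the completion message.
queue.CompleteAdding();
writer.Wait();

Console.Write(output.ToString());
```

In the real program the equivalent step would probably be to make sure the host, and with it the logger providers, is disposed before the process exits. That is an assumption to verify against the hosting setup in Program.cs.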

Investigate Uri allocations

Many Uri objects are created just to manipulate or parse URLs.

They are short-lived, pile up, and have to be garbage collected.

Can those allocations be reduced?
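
One idea to try, as a sketch only: the same URLs (stylesheets, scripts, navigation links) tend to recur on many pages, so each distinct URL string could be parsed once and the Uri instance reused. The uriCache dictionary and the GetUri helper are hypothetical and not part of the current code. Whether this is a net win would have to be measured, since the cache itself keeps every Uri alive.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache: each distinct URL string is parsed into a Uri once.
var uriCache = new ConcurrentDictionary<string, Uri>(StringComparer.Ordinal);

// Returns the cached Uri for the string, parsing it only on first use.
Uri GetUri(string url) => uriCache.GetOrAdd(url, static u => new Uri(u));

Uri first = GetUri("http://books.toscrape.com/index.html");
Uri second = GetUri("http://books.toscrape.com/index.html");

// Both lookups yield the same instance, so the second one does not parse again.
Console.WriteLine(object.ReferenceEquals(first, second));
Console.WriteLine(uriCache.Count);
```

Uri is immutable, so sharing instances between processors should be safe, including under the concurrent execution planned in the refactoring.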

Cannot create Output folder

A user tried to run the program on Windows and got this exception.

It seems to be about creating the Output folder.

They tried creating the Output folder manually, and the program then continued (with unrelated exceptions).

Unhandled exception. System.IO.DirectoryNotFoundException: Could not find a part of the path 'C:\code\assignments\marinasundstrom\BookScraper\bin\Debug\net7.0\Output'.
   at System.IO.FileSystem.GetFindData(String fullPath, Boolean isDirectory, Boolean ignoreAccessDenied, WIN32_FIND_DATA& findData)
   at System.IO.FileSystem.RemoveDirectory(String fullPath, Boolean recursive)
   at BookScraper.Scraper.SetupDirectoryStructure() in C:\code\assignments\marinasundstrom\BookScraper\Scraper.cs:line 48
   at BookScraper.Scraper..ctor(DocumentProcessor documentProcessor, ErrorSink errorSink, ILogger`1 logger, IHostApplicationLifetime hostApplicationLifetime) in C:\code\assignments\marinasundstrom\BookScraper\Scraper.cs:line 37
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.ConstructorInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
   at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitConstructor(ConstructorCallSite constructorCallSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite callSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitScopeCache(ServiceCallSite callSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitConstructor(ConstructorCallSite constructorCallSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite callSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitIEnumerable(IEnumerableCallSite enumerableCallSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite callSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite callSite, ServiceProviderEngineScope scope)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(Type serviceType)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type serviceType, ServiceProviderEngineScope serviceProviderEngineScope)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider provider, Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider provider)
   at Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Program.<Main>$(String[] args) in C:\code\assignments\marinasundstrom\BookScraper\Program.cs:line 26
   at Program.<Main>(String[] args)
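
The trace shows the exception coming from FileSystem.RemoveDirectory, called by Scraper.SetupDirectoryStructure. That suggests the folder is deleted before it is recreated, and that the delete fails when the folder does not exist yet. A possible fix, assuming that is what happens, is to delete only if the folder exists. ResetDirectory and the temp folder name are hypothetical.

```csharp
using System;
using System.IO;

// Leaves an empty directory at the given path, whether or not it existed before.
static void ResetDirectory(string path)
{
    // Directory.Delete throws DirectoryNotFoundException for a missing path,
    // so check for existence first.
    if (Directory.Exists(path))
    {
        Directory.Delete(path, recursive: true);
    }

    Directory.CreateDirectory(path);
}

string output = Path.Combine(Path.GetTempPath(), "BookScraperOutputDemo");

ResetDirectory(output); // works when the folder is missing
File.WriteAllText(Path.Combine(output, "index.html"), "<html></html>");
ResetDirectory(output); // and when it already exists with content

Console.WriteLine(Directory.GetFiles(output).Length);

Directory.Delete(output, recursive: true);
```

This would also explain why creating the Output folder manually let the program continue: the delete then had something to remove.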
