
Comments (3)

mralexgray commented on June 11, 2024

For example...

{
  userProfile: {
    fullName: null,
    title: null,
    location: null,
    photo: null,
    description: null,
    url: 'https://www.linkedin.com/in/me/'
  }
}

The problem likely starts around line 195 and shows up when inspecting the value of the userProfile variable.

from linkedin-profile-scraper-api.

jvandenaardweg commented on June 11, 2024

Thanks for reporting! This is now fixed in master.

Changes: #5


dmmarmol commented on June 11, 2024

Hi everyone!

I'm facing a similar issue, and I've checked that I have the latest selectors you pushed in #5 ✔️.

Judging by the logs, some "View more" buttons seem to be missed, even though the selectors are correct (I checked them manually in a browser).

export const RequestLinkedin = async ({ language }) => {
    try {
        const scraper = new LinkedInProfileScraper({
            sessionCookieValue: process.env.LI_AT_COOKIE_VALUE,
            keepAlive: process.env.NODE_ENV === 'development',
        });

        // Prepare the scraper
        // Loading it in memory
        await scraper.setup();

        const url = getURL({ language });
        const result = await scraper.run(url);

        return result;
    } catch (err) {
        if (err.name === 'SessionExpired') {
            // Do something when the scraper notifies you it's not logged-in anymore
            throw new Error('SessionExpired');
        }
        // Any other error is swallowed here, so the caller receives undefined
        return;
    }
};
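
Note that the logs below print `Setting session cookie using cookie: undefined`. That may only be a logging quirk, since the scraper still reports being logged in, but if `LI_AT_COOKIE_VALUE` is not actually set it would be worth failing fast. A minimal sketch of such a guard; `requireEnv` is a hypothetical helper, not part of the scraper library:

```javascript
// Hypothetical helper (not part of linkedin-profile-scraper):
// returns the value of an environment variable, or throws if it is
// missing or blank, so a misconfigured cookie is caught before scraping.
function requireEnv(name, env = process.env) {
    const value = env[name];
    if (typeof value !== 'string' || value.trim() === '') {
        throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
}
```

It could then be used as `sessionCookieValue: requireEnv('LI_AT_COOKIE_VALUE')` in the constructor options above.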

Logs

>  Scraper (setup): Launching puppeteer in the background...
>  Scraper (setup): Puppeteer launched!
>  Scraper (setup page): Blocking the following resources: image, media, font, texttrack, object, beacon, csp_report, imageset
>  Scraper (setup page): Should block scripts from 10366 unwanted hosts to speed up the crawling.
>  Scraper (setup page): Setting session cookie using cookie: undefined
>  Scraper (setup page): Session cookie set!
>  Scraper (setup page): Done!
>  Scraper (checkIfLoggedIn): Checking if we are still logged in...
>  Scraper (blocked script): xhr: dpm.demdex.net: https://dpm.demdex.net/id?d_visid_ver=5.1.1&d_fieldgroup=MC&d_rtbd=json&d_ver=2&d_orgid=14215E3D5995C57C0A495C55%40AdobeOrg&d_nsid=0&ts=1614113026516
>  Scraper (blocked script): xhr: dpm.demdex.net: https://dpm.demdex.net/id?d_visid_ver=5.1.1&d_fieldgroup=AAM&d_rtbd=json&d_ver=2&d_orgid=14215E3D5995C57C0A495C55%40AdobeOrg&d_nsid=0&d_mid=49017425936221650123187014327421593955&ts=1614113026536
>  Scraper (blocked script): xhr: dpm.demdex.net: https://dpm.demdex.net/id?d_visid_ver=5.1.1&d_fieldgroup=AAM&d_rtbd=json&d_ver=2&d_orgid=14215E3D5995C57C0A495C55%40AdobeOrg&d_nsid=0&d_mid=49017425936221650123187014327421593955&d_cid_ic=lnkdidsync%01AX1gF6l-GPXUsEfrGZfopE4VFCzS%26v%3D2%011&d_cid_ic=thirdpartyid%01AX1gF6l-GPXUsEfrGZfopE4VFCzS%26v%3D2%011&d_cid_ic=lnkd_member_id%01AX1gF6l-GPXUsEfrGZfopE4VFCzS%26v%3D2%011&ts=1614113026570
>  Scraper (checkIfLoggedIn): All good. We are still logged in.
>  Scraper (setup): Done!
>  Scraper (setup page): Blocking the following resources: image, media, font, texttrack, object, beacon, csp_report, imageset
>  Scraper (setup page): Should block scripts from 10366 unwanted hosts to speed up the crawling.
>  Scraper (setup page): Setting session cookie using cookie: undefined
>  Scraper (setup page): Session cookie set!
>  Scraper (setup page): Done!
>  Scraper (run) (1614113027545): Navigating to LinkedIn profile: https://linkedin.com/in/[USER_PROFILE]/en-US
>  Scraper (run) (1614113027545): LinkedIn profile page loaded!
>  Scraper (run) (1614113027545): Getting all the LinkedIn profile data by scrolling the page to the bottom, so all the data gets loaded into the page...
>  Scraper (run) (1614113027545): Parsing data...
>  Scraper (run) (1614113027545): Expanding all sections by clicking their "See more" buttons
>  Scraper (run) (1614113027545): Clicking button .pv-profile-section.pv-about-section .lt-line-clamp__more
>  Scraper (run) (1614113027545): Clicking button .pv-skill-categories-section [data-control-name="skill_details"]
>  Scraper (run) (1614113027545): Expanding all descriptions by clicking their "See more" buttons
>  Scraper (run) (1614113027545): Clicking button .lt-line-clamp__more[href="#"]:not(.lt-line-clamp__ellipsis--dummy)
>  Scraper (run) (1614113027545): Clicking button .lt-line-clamp__more[href="#"]:not(.lt-line-clamp__ellipsis--dummy)
>  Scraper (run) (1614113027545): Clicking button .lt-line-clamp__more[href="#"]:not(.lt-line-clamp__ellipsis--dummy)
>  Scraper (run) (1614113027545): Clicking button .lt-line-clamp__more[href="#"]:not(.lt-line-clamp__ellipsis--dummy)
>  Scraper (run) (1614113027545): Could not find or click see more button selector "JSHandle@node".
So we skip that one.
>  Scraper (run) (1614113027545): Clicking button .lt-line-clamp__more[href="#"]:not(.lt-line-clamp__ellipsis--dummy)
>  Scraper (run) (1614113027545): Could not find or click see more button selector "JSHandle@node".
So we skip that one.
>  Scraper (run) (1614113027545): Parsing profile data...
>  Scraper (run) (1614113027545): Got user profile data: {"fullName":null,"title":null,"location":null,"photo":null,"description":null,"url":"https://www.linkedin.com/feed/"}
>  Scraper (run) (1614113027545): Parsing experiences data...
>  Scraper (run) (1614113027545): Got experiences data: []
>  Scraper (run) (1614113027545): Parsing education data...
>  Scraper (run) (1614113027545): Got education data: []
>  Scraper (run) (1614113027545): Parsing volunteer experience data...
>  Scraper (run) (1614113027545): Got volunteer experience data: []
>  Scraper (run) (1614113027545): Parsing skills data...
>  Scraper (run) (1614113027545): Got skills data: []
>  Scraper (run) (1614113027545): Done! Returned profile details for: https://linkedin.com/in/[USER_PROFILE]/en-US
>  Scraper (run): Done. Puppeteer is being kept alive in memory.
>  {
>    userProfile: {
>      fullName: null,
>      title: null,
>      location: null,
>      photo: null,
>      description: null,
>      url: 'https://www.linkedin.com/feed/'
>    },
>    experiences: [],
>    education: [],
>    volunteerExperiences: [],
>    skills: []
>  }
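
In the output above every profile field is null and `url` points at `/feed/` rather than the requested `/in/` profile, which suggests the page may have been redirected before parsing. A rough sketch of how calling code could flag such a result instead of treating it as a success; `inspectProfileResult` is a hypothetical helper, and the `/in/` check is an assumption based only on the URLs shown in this thread:

```javascript
// Hypothetical helper: inspects the object returned by scraper.run()
// and reports two symptoms seen in this thread.
function inspectProfileResult(result) {
    const profile = (result && result.userProfile) || {};
    // Separate the url from the scraped fields (fullName, title, ...).
    const { url, ...fields } = profile;

    // True when no scraped field carries any data.
    const allFieldsNull = Object.values(fields).every(
        (value) => value === null || value === undefined
    );

    // Assumption: a profile URL always contains "/in/", so anything else
    // (for example "/feed/") means we ended up on a different page.
    const redirected = typeof url === 'string' && !url.includes('/in/');

    return { allFieldsNull, redirected };
}
```

For the output above this yields `allFieldsNull: true` and `redirected: true`, whereas the result in the first comment (url ending in `/in/me/`) yields `redirected: false`, so the two reports may have different causes.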
