uasurfer's Introduction

uasurfer

User Agent Surfer (uasurfer) is a lightweight Golang package that parses and abstracts HTTP User-Agent strings with particular attention to device type.

The following information is returned by uasurfer from a raw HTTP User-Agent string:

Name             Example   Coverage in 192,792 parses
Browser name     chrome    99.85%
Browser version  53        99.17%
Platform         ipad      99.97%
OS name          ios       99.96%
OS version       10        98.81%
Device type      tablet    99.98%

Layout engine, browser language, and other esoteric attributes are not parsed.

Coverage is estimated from a random sample of real UA strings collected across thousands of sources in the US and EU in mid-2016.

Usage

Parse(ua string) Function

The Parse() function accepts a user agent string and returns a UserAgent struct with named constants and integers for versions (major, minor, and patch separately). A string can be retrieved by calling .String() on a value, such as ua.Browser.Name.String().

// Define a user agent string
myUA := "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36"

// Parse() returns a single value holding all parsed attributes
ua := uasurfer.Parse(myUA)

where the example UserAgent is:

{
    Browser {
        BrowserName: BrowserChrome,
        Version: {
            Major: 45,
            Minor: 0,
            Patch: 2454,
        },
    },
    OS {
        Platform: PlatformMac,
        Name: OSMacOSX,
        Version: {
            Major: 10,
            Minor: 10,
            Patch: 5,
        },
    },
    DeviceType: DeviceComputer,
}

Usage note: Some OSes do not return a version; see the docs below. Linux is typically not reported with a specific distro name or version.

Browser Name

Browser Version

Browser version returns a uint8 of the major version attribute of the User-Agent string. For example, Chrome 45.0.23423 would return 45. The intention is to support math operators with versions, such as "do XYZ for Chrome version >23".

Unknown version is returned as 0.

Platform

  • PlatformWindows - Microsoft Windows
  • PlatformMac - Apple Macintosh
  • PlatformLinux - Linux, including Android and other OSes
  • PlatformiPad - Apple iPad
  • PlatformiPhone - Apple iPhone
  • PlatformBlackberry - RIM Blackberry
  • PlatformWindowsPhone - Microsoft Windows Phone & Mobile
  • PlatformKindle - Amazon Kindle & Kindle Fire
  • PlatformPlaystation - Sony Playstation, Vita, PSP
  • PlatformXbox - Microsoft Xbox
  • PlatformNintendo - Nintendo DS, Wii, etc.
  • PlatformUnknown - Unknown

OS Name

  • OSWindows
  • OSMacOSX - includes "macOS Sierra"
  • OSiOS
  • OSAndroid
  • OSChromeOS
  • OSWebOS
  • OSLinux
  • OSPlaystation
  • OSXbox
  • OSNintendo
  • OSUnknown

OS Version

OS X major version is always 10 for releases prior to Big Sur, with consecutive minor versions indicating successive releases (10 - Yosemite, 11 - El Capitan, 12 - Sierra, etc.). macOS Big Sur is indicated as {11, 1, 0}. Windows version is the NT version. Version{0, 0, 0} indicates that the version is unknown or not evaluated. Versions can be compared using the Less function: if ver1.Less(ver2) {}
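
The comparison described above can be sketched with a small stand-alone type. The Version struct and its methods below are a local approximation written for illustration; they follow the documented shape (major, minor, patch) but are not copied from the package, so the field types and the IsUnknown helper are assumptions.

```go
package main

import "fmt"

// Version is a hypothetical stand-in for the package's version type.
type Version struct {
	Major, Minor, Patch int
}

// Less reports whether v sorts before other, comparing major, then minor, then patch.
func (v Version) Less(other Version) bool {
	if v.Major != other.Major {
		return v.Major < other.Major
	}
	if v.Minor != other.Minor {
		return v.Minor < other.Minor
	}
	return v.Patch < other.Patch
}

// IsUnknown reports whether v is the zero value used for "unknown or not evaluated".
func (v Version) IsUnknown() bool {
	return v == Version{}
}

func main() {
	yosemite := Version{10, 10, 5}
	bigSur := Version{11, 1, 0}
	fmt.Println(yosemite.Less(bigSur)) // true
	fmt.Println(bigSur.Less(yosemite)) // false
	fmt.Println(Version{}.IsUnknown()) // true
}
```

Checking for the zero value before comparing avoids treating an unparsed version as "older than everything".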

Here are some examples across the platform, os.name, and os.version:

  • For Windows XP (Windows NT 5.1), "PlatformWindows" is the platform, "OSWindows" is the name, and {5, 1, 0} the version.
  • For OS X 10.5.1, "PlatformMac" is the platform, "OSMacOSX" the name, and {10, 5, 1} the version.
  • For Android 5.1, "PlatformLinux" is the platform, "OSAndroid" is the name, and {5, 1, 0} the version.
  • For iOS 5.1, "PlatformiPhone" or "PlatformiPad" is the platform, "OSiOS" is the name, and {5, 1, 0} the version.

Windows Version Guide

  • Windows 10 - {10, 0, 0}
  • Windows 8.1 - {6, 3, 0}
  • Windows 8 - {6, 2, 0}
  • Windows 7 - {6, 1, 0}
  • Windows Vista - {6, 0, 0}
  • Windows XP - {5, 1, 0} or {5, 2, 0}
  • Windows 2000 - {5, 0, 0}
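
As a rough illustration of the guide above, the mapping can be written as a lookup from NT major/minor numbers to a marketing name. The windowsName helper is hypothetical and not part of the package; it only encodes the rows listed here.

```go
package main

import "fmt"

// windowsName maps an NT major/minor pair to a marketing name,
// following the version guide above. Illustrative helper only.
func windowsName(major, minor int) string {
	switch {
	case major == 10 && minor == 0:
		return "Windows 10"
	case major == 6 && minor == 3:
		return "Windows 8.1"
	case major == 6 && minor == 2:
		return "Windows 8"
	case major == 6 && minor == 1:
		return "Windows 7"
	case major == 6 && minor == 0:
		return "Windows Vista"
	case major == 5 && (minor == 1 || minor == 2):
		return "Windows XP"
	case major == 5 && minor == 0:
		return "Windows 2000"
	}
	return "unknown"
}

func main() {
	fmt.Println(windowsName(6, 1))
	fmt.Println(windowsName(5, 2))
	fmt.Println(windowsName(0, 0))
}
```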

Windows 95, 98, and ME represent 0.01% of traffic worldwide and are not available through this package at this time.

DeviceType

DeviceType is typically quite accurate, though distinguishing between phones and tablets on Android is not always possible due to how some vendors design their UA strings. A mobile Android device without a tablet indicator defaults to being classified as a phone. DeviceTV supports major brands such as Philips, Sharp, and Vizio, and streaming boxes such as Apple, Google, Roku, and Amazon.

  • DeviceComputer
  • DevicePhone
  • DeviceTablet
  • DeviceTV
  • DeviceConsole
  • DeviceWearable
  • DeviceUnknown
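
The Android default described above can be sketched as follows. This is not the package's implementation: the classifyAndroid name, the string return values, and the use of a literal "tablet" substring as the indicator are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyAndroid applies the default described above: an Android UA string
// is treated as a phone unless it carries a tablet indicator.
// The "tablet" substring is an assumed indicator chosen for illustration.
func classifyAndroid(ua string) string {
	ua = strings.ToLower(ua)
	switch {
	case !strings.Contains(ua, "android"):
		return "unknown"
	case strings.Contains(ua, "tablet"):
		return "tablet"
	default:
		return "phone"
	}
}

func main() {
	fmt.Println(classifyAndroid("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36"))
}
```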

Example Combinations of Attributes

  • Surface RT -> OSWindows, DeviceTablet, OSVersion >= 6
  • Android Tablet -> OSAndroid, DeviceTablet
  • Microsoft Edge -> BrowserIE, BrowserVersion >= 12.0.0

To do

  • Remove compiled regexp in favor of strings.Contains wherever possible (lowers mem/alloc)
  • Better version support on Firefox derivatives (e.g. SeaMonkey)
  • Potential additional browser support:
      • "NetFront" (1% share in India)
      • "Sogou Explorer" (5% share in China)
      • "Maxthon" (1.5% share in China)
      • "Nokia"
  • Potential additional OS support:
      • "Nokia" (5% share in India)
      • "Series 40" (5.5% share in India)
      • Windows 2003 Server
  • iOS safari browser identification based on iOS version
  • Add android version to browser identification
  • old Macs
      • "opera/9.64 (macintosh; ppc mac os x; u; en) presto/2.1.1"
  • old Windows
      • "mozilla/5.0 (windows nt 4.0; wow64) applewebkit/537.36 (khtml, like gecko) chrome/37.0.2049.0 safari/537.36"

uasurfer's People

Contributors

c2h5oh, diligiant, dorokhov, dvrkps, edengillies-lumen, erikdubbelboer, healiha, iand, jacobpierce, jfie5, lziest, naoto0822, ryanslade, scritchley, snawoot, vavrusa

uasurfer's Issues

MacOS Big Sur not detected

The library is unable to detect macOS Big Sur (11). A user agent example is the following:
Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36

Does anyone have an update to support this?

Project Abandoned?

Looks like this project was abandoned 3 years ago. 21 issues filed, 9 pull requests.

Anyone interested in forking/maintaining?

Support "dalvik" android strings

reports browser name & browser version unknown while running from dalvik VM

dalvik/2.1.0 (linux; u; android 6.0; e5633 build/30.2.b.0.100)
dalvik/1.6.0 (linux; u; android 4.4.2; lenovo a5500-h build/kot49h)
dalvik/1.6.0 (linux; u; android 4.4.2; ixion xl145 snatch build/kot49h)
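
One way such strings might be recognized is sketched below. The regular expression and the parseDalvik helper are hypothetical and only cover strings shaped like the three examples above; nothing here reflects how the package itself handles them.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// dalvikRe captures the Dalvik VM version and the Android version
// from a lowercased UA string shaped like the examples above.
var dalvikRe = regexp.MustCompile(`^dalvik/(\d+(?:\.\d+)*) \(linux; u; android (\d+(?:\.\d+)*)`)

// parseDalvik returns the Android major version for a Dalvik UA string,
// and false if the string does not look like one.
func parseDalvik(ua string) (androidMajor int, ok bool) {
	m := dalvikRe.FindStringSubmatch(strings.ToLower(ua))
	if m == nil {
		return 0, false
	}
	major, err := strconv.Atoi(strings.SplitN(m[2], ".", 2)[0])
	if err != nil {
		return 0, false
	}
	return major, true
}

func main() {
	for _, ua := range []string{
		"dalvik/2.1.0 (linux; u; android 6.0; e5633 build/30.2.b.0.100)",
		"dalvik/1.6.0 (linux; u; android 4.4.2; lenovo a5500-h build/kot49h)",
	} {
		major, ok := parseDalvik(ua)
		fmt.Println(major, ok)
	}
}
```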

HeadlessChrome not counted as bot

Hello,

in my use case - a web analytics web app - I expect user agents marked as HeadlessChrome to not be humans and therefore to be bots.

Example of such a user agent:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/78.0.3882.0 Safari/537.36

thanks!
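
A caller-side check for this case could look like the sketch below. The isHeadlessChrome helper is hypothetical; it simply looks for the HeadlessChrome product token and makes no claim about how the package classifies bots.

```go
package main

import (
	"fmt"
	"strings"
)

// isHeadlessChrome reports whether the UA string advertises
// the HeadlessChrome product token (case-insensitive).
func isHeadlessChrome(ua string) bool {
	return strings.Contains(strings.ToLower(ua), "headlesschrome/")
}

func main() {
	ua := "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/78.0.3882.0 Safari/537.36"
	fmt.Println(isHeadlessChrome(ua))
}
```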

ios device isn't detected

In my case, the following should be detected as an ios device and CocCoc browser
CocCoc/95.0.212 CFNetwork/1128.0.1 Darwin/19.6.0

confirm whether this iPhone UA is a bot

    // possibly a bot, unconfirmed -- lacking "Safari/xx"
    // {"Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Mobile/10A5376e",
    //  BrowserSafari, 6, PlatformiPhone, OSiOS, 6, DevicePhone},

Bot device detection is always DeviceComputer

Hi,

We've noticed that when the User-Agent is detected as a bot, it is always set to device type DeviceComputer.
This is not always correct. Googlebot, for instance, can crawl a page using different User-Agent strings for mobile or desktop.
See https://support.google.com/webmasters/answer/1061943?hl=en.
This library should reflect this when parsing a User-Agent. If the bot is crawling as a mobile phone it should be detected as DevicePhone.

I will submit a proposal as PR for fixing this.

Cheers,
Thomas

Parsing string and returning uasurfer types

We currently have a problem with parsing types from database/sql.Row.Scan. We can deal with it by implementing our own Scanner, but it would be nice if this package could offer something like the following:

func ParseDeviceType(deviceType string) (dt uasurfer.DeviceType, err error) {
  // Parsing work here
  return
}
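
A minimal sketch of what the requested function could do is shown below. It uses a local DeviceType with the constant names listed in the README; the numeric ordering, the lookup table, and the error text are assumptions, and the real package may differ.

```go
package main

import "fmt"

// DeviceType is a local stand-in for the package's type of the same name.
type DeviceType int

// The constant order here is arbitrary and chosen for illustration.
const (
	DeviceUnknown DeviceType = iota
	DeviceComputer
	DevicePhone
	DeviceTablet
	DeviceTV
	DeviceConsole
	DeviceWearable
)

// deviceTypeByName maps the printed constant name back to its value.
var deviceTypeByName = map[string]DeviceType{
	"DeviceUnknown":  DeviceUnknown,
	"DeviceComputer": DeviceComputer,
	"DevicePhone":    DevicePhone,
	"DeviceTablet":   DeviceTablet,
	"DeviceTV":       DeviceTV,
	"DeviceConsole":  DeviceConsole,
	"DeviceWearable": DeviceWearable,
}

// ParseDeviceType converts a stored name such as "DevicePhone" back into a DeviceType.
func ParseDeviceType(s string) (DeviceType, error) {
	dt, ok := deviceTypeByName[s]
	if !ok {
		return DeviceUnknown, fmt.Errorf("unknown device type %q", s)
	}
	return dt, nil
}

func main() {
	dt, err := ParseDeviceType("DevicePhone")
	fmt.Println(dt == DevicePhone, err)
}
```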

UA generation

Hi there,

it seems now you can parse a UA string, but you can't build it.
Are there any plans to introduce UA generation?

Feature: Implement sql/driver.Valuer and Scanner for database compatibility

First of all, thanks for this package because it has helped us in a lot of ways.

Current Problem

For strong typing, the structs that mirror what we insert into the database use the uasurfer types. Combined with sqlx, this is how we read the data:

type Clicks struct {
  OSName uasurfer.OSName `db:"os_name"`
}

var clicks []Clicks
if err := conn.Select(&clicks, "SELECT os_name FROM clicks"); err != nil {
  log.Fatal(err)
}

Error.

"os_name": unsupported Scan, storing driver.Value type string into type *uasurfer.OSName

My solution / the asked Feature

I had to fork this package, and I've implemented the database/sql/driver.Valuer and database/sql.Scanner interfaces on the uasurfer types.
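
The approach described above might look roughly like this. The OSName type below is a local stand-in with a subset of the documented constants; storing the constant name as text is one possible design and an assumption, not necessarily what the fork does.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// OSName is a local stand-in for the package's type of the same name.
type OSName int

const (
	OSUnknown OSName = iota
	OSWindows
	OSMacOSX
	OSiOS
	OSAndroid
)

var osNames = [...]string{"OSUnknown", "OSWindows", "OSMacOSX", "OSiOS", "OSAndroid"}

// String returns the constant name, falling back to OSUnknown when out of range.
func (o OSName) String() string {
	if o < 0 || int(o) >= len(osNames) {
		return "OSUnknown"
	}
	return osNames[o]
}

// Value implements driver.Valuer by storing the constant name as text.
func (o OSName) Value() (driver.Value, error) {
	return o.String(), nil
}

// Scan implements sql.Scanner by accepting the text form back from the driver.
func (o *OSName) Scan(src interface{}) error {
	var s string
	switch v := src.(type) {
	case string:
		s = v
	case []byte:
		s = string(v)
	default:
		return fmt.Errorf("cannot scan %T into OSName", src)
	}
	for i, name := range osNames {
		if name == s {
			*o = OSName(i)
			return nil
		}
	}
	return fmt.Errorf("unknown OS name %q", s)
}

// Compile-time checks that both interfaces are satisfied.
var (
	_ sql.Scanner   = (*OSName)(nil)
	_ driver.Valuer = OSName(0)
)

func main() {
	var o OSName
	if err := o.Scan("OSAndroid"); err != nil {
		fmt.Println(err)
		return
	}
	v, _ := o.Value()
	fmt.Println(o, v)
}
```

With both methods in place, a struct field of this type can be used directly as a destination or argument in database/sql calls.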

Edg on different platforms

Now that Edge on Chromium is out, I noticed that, for good reasons, its identification is missing (see User Agent String on edge preview) (Edg/…).

As linked in the aforementioned page, I discovered that Edge for iOS and Android are also not identified (edge ios android) (EdgiOS/… & EdgA/…).
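
The three tokens mentioned above could be told apart with a check along these lines. The edgeVariant helper and its return values are hypothetical, and the sample string in main is only shaped like a Chromium-based Edge UA for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// edgeVariant returns which Chromium-era Edge token, if any, appears in the UA string.
// The leading space and trailing slash keep "edg/" from matching the legacy "edge/" token.
func edgeVariant(ua string) string {
	ua = strings.ToLower(ua)
	switch {
	case strings.Contains(ua, " edgios/"):
		return "ios"
	case strings.Contains(ua, " edga/"):
		return "android"
	case strings.Contains(ua, " edg/"):
		return "desktop"
	}
	return ""
}

func main() {
	// Illustrative string, not taken from the issue.
	ua := "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36 Edg/80.0.361.66"
	fmt.Println(edgeVariant(ua))
}
```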

safari on webos

useragent: Mozilla/5.0 (webOS/2.0.1; U; en-US) AppleWebKit/532.2 (KHTML, like Gecko) Version/1.0 Safari/532.2 Pre/1.2

safari on webos shows BrowserUnknown due to:
case strings.Contains(ua, "like gecko") && strings.Contains(ua, "mozilla/") && strings.Contains(ua, "safari/") && !strings.Contains(ua, "linux") && !strings.Contains(ua, "android") && !strings.Contains(ua, "browser/") && !strings.Contains(ua, "os/") && !strings.Contains(ua, "yabrowser/"):

more exactly !strings.Contains(ua, "os/")
also fixed if replaced with !(strings.Contains(ua, "os/") && !strings.Contains(ua, "webos/"))

Test data access

Can you share the random sample of real UA strings collected across thousands of sources in US and EU mid-2016 mentioned in readme?

I wanted to add a few things to uasurfer (expect pull requests soon), but I'd rather test them on my own first. I could get the list from our own logs, but I fear that the sample would not be representative.

Remove prefix from parsed ua string

Following is the parsed output of a UA string. What is the need for prefixing the type name to the actual value? For example, for BrowserName, isn't it better to have just Chrome and not BrowserChrome?

Current output-

{
    Browser {
        BrowserName: BrowserChrome,
        Version: {
            Major: 45,
            Minor: 0,
            Patch: 2454,
        },
    },
    OS {
        Platform: PlatformMac,
        Name: OSMacOSX,
        Version: {
            Major: 10,
            Minor: 10,
            Patch: 5,
        },
    },
    DeviceType: DeviceComputer,
}

Desired output-

{
    Browser {
        BrowserName: Chrome,
        Version: {
            Major: 45,
            Minor: 0,
            Patch: 2454,
        },
    },
    OS {
        Platform: Mac,
        Name: MacOSX,
        Version: {
            Major: 10,
            Minor: 10,
            Patch: 5,
        },
    },
    DeviceType: Computer,
}

"brew" platform type

opera/9.80 (brew; opera mini/5.0/27.2405; u; en) presto/2.8.119 240x320 samsung sch-u660

Google search console mobile useragent detected as computer

The Google Search Console bot for mobile is detected as a computer, which makes Google believe the mobile sites are for PC.

package main

import (
	"fmt"

	"github.com/avct/uasurfer"
)

func main() {
	useragent := "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

	ua := uasurfer.Parse(useragent)
	fmt.Printf("Useragent: %s\n Result: %+v\n", useragent, ua)
}

Output:

Useragent: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
 Result: &{Browser:{Name:BrowserGoogleBot Version:{Major:0 Minor:0 Patch:0}} OS:{Platform:PlatformBot Name:OSBot Version:{Major:6 Minor:0 Patch:1}} DeviceType:DeviceComputer}

Could be good to detect google bots as computer or mobile as it should be

Going to fork to modify this
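
A possible shape for such a change is sketched below: once a string is known to be a bot, look at the rest of it to decide which device the crawler is emulating. The botDeviceType helper and its reliance on the "mobile" token are assumptions for illustration, not the package's or the fork's logic.

```go
package main

import (
	"fmt"
	"strings"
)

// botDeviceType guesses which device a crawler is emulating.
// It assumes the caller has already established that the UA belongs to a bot.
func botDeviceType(ua string) string {
	if strings.Contains(strings.ToLower(ua), "mobile") {
		return "phone"
	}
	return "computer"
}

func main() {
	mobile := "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
	desktop := "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
	fmt.Println(botDeviceType(mobile))
	fmt.Println(botDeviceType(desktop))
}
```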

report playstation version as os version

e.g. playstation 3 is v3, 4 is v4, etc.

consider doing with all major consoles

---- summary ----
total user agents evaluated: 57092
total with unknown attributes: 903 0.015816577
requests with no UA string: 57 0.0009973928
---- details ----
browser name unknown: 84 0.0014713095
browser version unknown: 160 0.0028024942
platform unknown: 20 0.00035031178
os name unknown: 25 0.0004378897
os version unknown: 831 0.014555454
device unknown: 9 0.0001576403

Proto consideration

Thank you for the library.

Consider generating device/os/browser types from a proto file. Proto enums are easily extended should new device/os/browser coverage be needed. They also have the advantage of being referenceable by other definitions. Take google/protobuf/empty.proto for example: a simple empty proto that is easy to create but referenced by many. Or google/protobuf/timestamp.proto, which is commonly used because it ensures a timestamp with millisecond and/or nanosecond precision is passed around between the services that need it. It would be good to pass these device/os/browser types as types and not as strings. Users not needing the proto definitions can continue to use the library as usual.

types
   | -> device
         | -> device.go
         | -> device.proto
         | -> device.pb.go
   | -> os
   ...

device.go

type Device struct {
   device DeviceType // from proto
   version string // or struct with major(), minor(), patch()
}

device.proto

syntax = "proto3";
option go_package = "github.com/avct/uasurfer/types/device";
enum DeviceType {
 unknown = 0;
 ...
};

or Device type can be a message

syntax = "proto3";
option go_package = "github.com/avct/uasurfer/types/device";
message Device {
  DeviceType deviceType = 1;
  Version version = 2;
}

Handling for FB app UA strings

(null) [fban/fbios;fbav/61.0.0.53.158;fbbv/35251526;fbrv/0;fbdv/ipad4,1;fbmd/ipad;fbsn/iphone os;fbsv/9.3.2;fbss/2;fbcr/;fbid/tablet;fblc/en_us;fbop/5]

facebook app

The four-letter codes beginning with FB appear to be named properties:
[
FBAN/FBIOS;
FBAV/20.1.0.15.10;
FBBV/5758778;
FBDV/iPhone6,2;
FBMD/iPhone;
FBSN/iPhone OS;
FBSV/8.1.2;
FBSS/2;
FBCR/TELEGRL;
FBID/phone;
FBLC/da_DK;
FBOP/5
]

My guesses about what the property names mean:
FBAN: FaceBook Application Name
FBAV: FaceBook Application Version
FBBV: FaceBook Build Version
FBDV: FaceBook Device Version
FBMD: FaceBook Major Device
FBSN: FaceBook System Name
FBSV: FaceBook System Version
FBSS: FaceBook System Something :)
FBCR: FaceBook CarrieR
FBID: FaceBook Identity of Device
FBLC: FaceBook Language Code
FBOP: FaceBook Other Parameters? I've no idea - looks like it might be a decimalised bit mask

Possible values I've seen in some of the fields (those in parens are regex patterns):

FBAN: (FBIOS|FB4A|MessengerForiOS)
FBAV: { many, many... }
FBBV: { many, many... }
FBDV: (iPhone|iPad)[0-9],[0-9]
FBMD: (iPhone|iPad)
FBSN: (iPhone OS)
FBSV: { matches the OS version earlier in the useragent string }
FBSS: ([1-3])
FBCR: { Sprint, Verizon, , AT&T, Three, vodaAU, 3Austria, Telstra, TELIA, OPTUS, U.S.Cellular, TFW, OrangeFrance }
FBID: (phone|tablet)
FBLC: { any language code of the form en_US - I've seen many }
FBOP: (1|5)
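
Given the structure described above, the bracketed section can be split into key/value pairs with a few string operations. The parseFBProps helper is hypothetical; it assumes the properties sit between the first "[" and the last "]", are separated by ";", and use "/" between name and value, as in the examples in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFBProps extracts the NAME/value pairs from the bracketed
// Facebook section of a UA string. Keys are upper-cased so that
// lowercased input yields the same keys as the original casing.
func parseFBProps(ua string) map[string]string {
	start := strings.Index(ua, "[")
	end := strings.LastIndex(ua, "]")
	if start < 0 || end <= start {
		return nil
	}
	props := make(map[string]string)
	for _, field := range strings.Split(ua[start+1:end], ";") {
		parts := strings.SplitN(strings.TrimSpace(field), "/", 2)
		if len(parts) != 2 || parts[0] == "" {
			continue // skip empty or malformed fields
		}
		props[strings.ToUpper(parts[0])] = parts[1]
	}
	return props
}

func main() {
	ua := "(null) [fban/fbios;fbav/61.0.0.53.158;fbbv/35251526;fbrv/0;fbdv/ipad4,1;fbmd/ipad;fbsn/iphone os;fbsv/9.3.2;fbss/2;fbcr/;fbid/tablet;fblc/en_us;fbop/5]"
	props := parseFBProps(ua)
	fmt.Println(props["FBID"], props["FBSN"], props["FBSV"])
}
```

The FBID and FBSV fields look like the most direct inputs for device type and OS version, though that reading rests on the guesses listed above.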

Device and Platform unknown with this ua

	"Opera/9.80 (J2ME/MIDP; Opera Mini/4.5.40312/60.297; U; en) Presto/2.12.423 Version/12.16",
	"SAMSUNG-SM-B313E Opera/9.80 (J2ME/MIDP; Opera Mini/4.5.40318/60.297; U; en) Presto/2.12.423 Version/12.16",
	"Opera/9.80 (MAUI Runtime; Opera Mini/4.4.39001/60.297; U; en) Presto/2.12.423 Version/12.16",
	"Mozilla/5.0 (compatible; MSIE 10.0; Macintosh; Intel Mac OS X 10_7_3; Trident/6.0)",
	"Opera/9.80 (J2ME/MIDP; Opera Mini/4.5.40380/60.297; U; en) Presto/2.12.423 Version/12.16",
	"Opera/9.80 (J2ME/MIDP; Opera Mini/8.0.40325/60.297; U; en) Presto/2.12.423 Version/12.16",
	"Opera/9.80 (J2ME/MIDP; Opera Mini/8.0.35626/60.297; U; en) Presto/2.12.423 Version/12.16",
	"SAMSUNG-SM-B350E Opera/9.80 (J2ME/MIDP; Opera Mini/8.0.40326/60.297; U; en) Presto/2.12.423 Version/12.16",
	"Opera/9.80 (J2ME/MIDP; Opera Mini/4.1.15082/60.297; U; en) Presto/2.12.423 Version/12.16",
	"Opera/9.80 (SpreadTrum; Opera Mini/4.4.31492/60.297; U; en) Presto/2.12.423 Version/12.16",

Unknown OS and Device for this ua

Sometimes recognizes Safari as Opera

Hey there!
I've recently adopted your package into my project, and after running a lot of tests I've come across some User Agents that your package recognizes as Opera while they are from Safari:

UA:
"Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) OPiOS/8.0.0.78129 Mobile/11D201 Safari/9537.53 XrRhbohyvvUWmRtipEat388XI4n5O7"

UASURFER:
&{Browser:{Name:BrowserOpera Version:{Major:8 Minor:0 Patch:0}} OS:{Platform:PlatformiPhone Name:OSiOS Version:{Major:7 Minor:1 Patch:1}} DeviceType:DevicePhone}

UA2:
"Mozilla/5.0 (iPhone; CPU iPhone OS 10_1_1 like Mac OS X) AppleWebKit/602.2.14 (KHTML, like Gecko) OPiOS/16.0.8.121059 Mobile/14B100 Safari/9537.53"

UASURFER2:
&{Browser:{Name:BrowserOpera Version:{Major:16 Minor:0 Patch:8}} OS:{Platform:PlatformiPhone Name:OSiOS Version:{Major:10 Minor:1 Patch:1}} DeviceType:DevicePhone}

Thank you for all the hard work (:
