microsoft / product-recommendations
Product Recommendations solution
License: Other
Hi,
I tried the ARM deployment; it's very useful.
By the way, I'd like to integrate multiple solutions on one Azure Web App. What should I do?
I manage three solutions and update each model every day; each one runs on its own Web App.
I'd like to consolidate them into a single Web App, because running several Web Apps is expensive.
I do not want to mix the models in one solution, because they are used in completely different use cases.
Do you have any ideas?
Thank you.
I would like to know if it is possible to have alternative products (ones that may be a better alternative, or carry a higher markup) come up in the recommendation results. These products may or may not be cold, but they need to be prioritized over standard recommendations.
Is this possible using features, or would the code here have to be changed?
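For illustration, the kind of post-processing I have in mind could look like this (a sketch only; the boost weights and item IDs are made up, not part of the service):

```python
# Sketch: re-rank service results with a per-item boost (e.g. markup).
# The recommendation payload shape and boost values are hypothetical.
def boost_recommendations(recommended, boosts, default=1.0):
    """Multiply each recommendation score by a per-item boost factor
    and re-sort, so higher-markup alternatives float to the top."""
    rescored = [
        (item, score * boosts.get(item, default))
        for item, score in recommended
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

recs = [("sku-001", 0.9), ("sku-002", 0.8), ("sku-003", 0.5)]
boosts = {"sku-003": 2.5}  # hypothetical high-markup alternative
print(boost_recommendations(recs, boosts))
# sku-003 now ranks first: 0.5 * 2.5 = 1.25
```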
Hello,
I have been running some tests with a catalog of 50 unique items and a usage file of 200 unique users. After training the model, I can construct a 'user recommendation matrix' from the recommendations (k = 50) for each user. From this, and believing that SAR builds the user affinity matrix in the way that I think it does, I am able to fully reconstruct the item-item similarity matrix.
When I reconstruct the item-item similarity matrix, I find that ALL of the diagonal values are set to zero. This behavior is observed with cooccurrence, jaccard, and lift.
Please tell me if this is the desired behavior, and if so, why. I cannot find any reason to set these diagonal elements (essentially a measure of an item's frequency) to zero.
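For concreteness, here is a toy reconstruction of the behavior I'm describing (my own code, not SAR's); zeroing the diagonal removes self-similarity, presumably so the seed item is never its own top recommendation:

```python
# Sketch: Jaccard item-item similarity from a tiny usage set, with the
# diagonal zeroed (my reconstruction of the observed behavior, not SAR's code).
usage = {
    "u1": {"A", "B"},
    "u2": {"A", "C"},
    "u3": {"A", "B", "C"},
}
items = sorted({i for basket in usage.values() for i in basket})

def cooccurrence(i, j):
    # number of users whose basket contains both i and j
    return sum(1 for basket in usage.values() if i in basket and j in basket)

def jaccard(i, j):
    if i == j:
        return 0.0  # self-similarity removed, as observed in the results
    cij = cooccurrence(i, j)
    return cij / (cooccurrence(i, i) + cooccurrence(j, j) - cij)

sim = {(i, j): jaccard(i, j) for i in items for j in items}
print(sim[("A", "B")])  # 2 / (3 + 2 - 2)
```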
Thanks,
Ryan
In the docs, the default similarity function is stated to be Lift. As far as I can tell, it's actually Jaccard.
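For reference, the two functions differ only in how the co-occurrence count is normalized (standard definitions written from the counts, not the repo's code):

```python
# Standard definitions over co-occurrence counts (not the repo's code):
# c_ij = units (users or user+timestamp) containing both i and j,
# c_i, c_j = occurrence counts of each item alone.
def jaccard(c_ij, c_i, c_j):
    return c_ij / (c_i + c_j - c_ij)

def lift(c_ij, c_i, c_j):
    return c_ij / (c_i * c_j)

print(jaccard(2, 3, 2))  # 2 / (3 + 2 - 2) = 2/3
print(lift(2, 3, 2))     # 2 / (3 * 2) = 1/3
```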
Attempting to use the Deploy to Azure link with default values results in an error.
Steps to reproduce:
Result is this error:
{
  "code": "DeploymentFailed",
  "message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
  "details": [
    {
      "code": "BadRequest",
      "message": {
        "Code": "BadRequest",
        "Message": "The parameter 'remoteDebuggingVersion' has an invalid value. Details: Supported Versions: VS2017,VS2019.",
        "Target": null,
        "Details": [
          { "Message": "The parameter 'remoteDebuggingVersion' has an invalid value. Details: Supported Versions: VS2017,VS2019." },
          { "Code": "BadRequest" },
          {
            "ErrorEntity": {
              "ExtendedCode": "01033",
              "MessageTemplate": "The parameter '{0}' has an invalid value. Details: {1}.",
              "Parameters": [ "remoteDebuggingVersion", "Supported Versions: VS2017,VS2019" ],
              "Code": "BadRequest",
              "Message": "The parameter 'remoteDebuggingVersion' has an invalid value. Details: Supported Versions: VS2017,VS2019."
            }
          }
        ],
        "Innererror": null
      }
    }
  ]
}
{
  "code": "DeploymentFailed",
  "message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
  "details": [
    {
      "code": "BadRequest",
      "message": {
        "Code": "BadRequest",
        "Message": "The host name cognitive_servicesahqibdfiimvcews.azurewebsites.net is invalid.",
        "Target": null,
        "Details": [
          { "Message": "The host name cognitive_servicesahqibdfiimvcews.azurewebsites.net is invalid." },
          { "Code": "BadRequest" },
          {
            "ErrorEntity": {
              "ExtendedCode": "04003",
              "MessageTemplate": "The host name {0} is invalid.",
              "Parameters": [ "cognitive_servicesahqibdfiimvcews.azurewebsites.net" ],
              "Code": "BadRequest",
              "Message": "The host name cognitive_servicesahqibdfiimvcews.azurewebsites.net is invalid."
            }
          }
        ],
        "Innererror": null
      }
    }
  ]
}
It says "The host name cognitive_servicesahqibdfiimvcews.azurewebsites.net is invalid".
Please help!
Hi, is this solution able to improve an existing model without recreating a new model each time?
I was wondering whether the Swagger PUT call on models is meant to retrain a model? It appears to exist only for "default", so perhaps it just sets a model as the default?
If not, what is the correct way to update a model?
I have searched all of the documentation for an answer, but sorry if I missed it.
Thank you,
I am developing a script to add new products to my catalog file in my Azure blob container.
The blob type for my catalog file is an append blob, because in C# it supports appending new lines to an existing file. This is much simpler than using the standard block blob's write operation, which overwrites the previous entries.
When I try to train my model with my catalog file as type append blob, I get the error
'System.Exception: Failed downloading training files from storage. Model id: 198d7cdf-818c-4fd5-a8b3-c8c636cdabd8 ---> Microsoft.WindowsAzure.Storage.StorageException: Blob type of the blob reference doesn't match blob type of the blob. ---> System.InvalidOperationException: Blob type of the blob reference doesn't match blob type of the blob.
at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.UpdateAfterFetchAttributes(BlobAttributes blobAttributes, HttpWebResponse response)
at Microsoft.WindowsAzure.Storage.Blob.CloudBlob
Is there a chance append blobs will be supported in the future?
In the catalog file schema ,,[,], I think the item category can have only one value.
So what is the best practice when an item has many categories?
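One workaround that comes to mind (an assumption on my part, not documented behavior) is to keep the primary category in the category column and push the extra categories into the features list, which does allow multiple key=value pairs:

```python
# Sketch of a workaround (assumption, not documented behavior): primary
# category in the category column, extra categories as feature key=value pairs.
# The feature names "category2", "category3", ... are hypothetical.
def catalog_line(item_id, name, categories, features=None):
    primary, *extra = categories
    feats = dict(features or {})
    for n, cat in enumerate(extra, start=2):
        feats[f"category{n}"] = cat
    feat_str = ", ".join(f"{k}={v}" for k, v in feats.items())
    return f"{item_id},{name},{primary},,{feat_str}"

print(catalog_line("AB123", "Blue Blazer", ["Clothing", "Formal", "Sale"],
                   {"Brand": "Armani"}))
```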
https://www.nuget.org/packages/Microsoft.MachineLearning.TLCRecommendations/ seems to be unlisted?
Also, Microsoft.MachineLearning.Recommend.Sar.Sar does not seem to be available in ML.NET (I am trying on .NET 6)...
Any thoughts on where the above namespace was relocated in ML.NET?
Any guidance would be very much appreciated.
Are there any plans to use ML.NET to implement SAR?
I have cloned this repo locally.
I created an App Service in our Azure account and set up my local solution to deploy to it. (In order to do this I needed to upgrade the WindowsAzure.Storage reference to 9.x; not sure if that's relevant.)
Now when I go to the app homepage, I get a Yellow Screen of Death:
[BadImageFormatException: Could not load file or assembly 'ManagedBlingSigned' or one of its dependencies. An attempt was made to load a program with an incorrect format.]
System.Reflection.RuntimeAssembly._nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, RuntimeAssembly locationHint, StackCrawlMark& stackMark, IntPtr pPrivHostBinder, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks) +0
System.Reflection.RuntimeAssembly.nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, RuntimeAssembly locationHint, StackCrawlMark& stackMark, IntPtr pPrivHostBinder, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks) +36
System.Reflection.RuntimeAssembly.InternalLoadAssemblyName(AssemblyName assemblyRef, Evidence assemblySecurity, RuntimeAssembly reqAssembly, StackCrawlMark& stackMark, IntPtr pPrivHostBinder, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks) +152
System.Reflection.RuntimeAssembly.InternalLoad(String assemblyString, Evidence assemblySecurity, StackCrawlMark& stackMark, IntPtr pPrivHostBinder, Boolean forIntrospection) +77
System.Reflection.RuntimeAssembly.InternalLoad(String assemblyString, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection) +21
System.Reflection.Assembly.Load(String assemblyString) +28
System.Web.Configuration.CompilationSection.LoadAssemblyHelper(String assemblyName, Boolean starDirective) +38
[ConfigurationErrorsException: Could not load file or assembly 'ManagedBlingSigned' or one of its dependencies. An attempt was made to load a program with an incorrect format.]
System.Web.Configuration.CompilationSection.LoadAssemblyHelper(String assemblyName, Boolean starDirective) +738
System.Web.Configuration.CompilationSection.LoadAllAssembliesFromAppDomainBinDirectory() +217
System.Web.Configuration.CompilationSection.LoadAssembly(AssemblyInfo ai) +130
System.Web.Compilation.BuildManager.GetReferencedAssemblies(CompilationSection compConfig) +170
System.Web.Compilation.BuildManager.GetPreStartInitMethodsFromReferencedAssemblies() +92
System.Web.Compilation.BuildManager.CallPreStartInitMethods(String preStartInitListPath, Boolean& isRefAssemblyLoaded) +290
System.Web.Compilation.BuildManager.ExecutePreAppStart() +157
System.Web.Hosting.HostingEnvironment.Initialize(ApplicationManager appManager, IApplicationHost appHost, IConfigMapPathFactory configMapPathFactory, HostingEnvironmentParameters hostingParameters, PolicyLevel policyLevel, Exception appDomainCreationException) +549
[HttpException (0x80004005): Could not load file or assembly 'ManagedBlingSigned' or one of its dependencies. An attempt was made to load a program with an incorrect format.]
System.Web.HttpRuntime.FirstRequestInit(HttpContext context) +10075108
System.Web.HttpRuntime.EnsureFirstRequestInit(HttpContext context) +95
System.Web.HttpRuntime.ProcessRequestNotificationPrivate(IIS7WorkerRequest wr, HttpContext context) +254
Help, please?
Am I thinking about this right with our IDs (bold) and features for the product catalog? Our categories have IDs, like the image below.
I think I will make one model per market (Sweden, Germany, etc.). Each market has its own translations of the SD1 etc. category names.
AB2065138, Blue Casual Tight fit Medium Armani Blazer,SD1M5S6,, Context=Casual, Gender=Male, Brand=Armani, Article Standard Colour=Blue, Size=Medium, Seasonality=Summer, Fabric appearence=Velvet
Another one in clothing would be
2b324086-85ba-4d68-b83d-8dae68db39f0, White Loose fit Small Adidas T-shirt,SD1M10S1,, Context=Casual, Gender=Women, Brand=Adidas, Article Standard Colour=White, Size=Small, Seasonality=All year, Fabric appearence=Textile
But then how do we do Accessories? Is that a different model?
b68bd680-5cc2-4220-98b6-be743943006c, S.c Iphone X/xs Silicone Black,SD2M20S1,, Gender=Unisex, Brand=The Case Factory, Article Standard Colour=Black, Size=Small
Some product attributes do not exist for Accessories the same way they do for Clothing.
Hi, I've been training the model successfully with different sources of data (belonging to different clients) for the past few weeks. But last week something strange happened.
On March 5th I successfully trained a model with 10.5 million lines in the usage files using a B3 instance (with 7GB of RAM). The model returns this info:
{
"id": "xxxxxxxx",
"description": "Client 1 recommendations model",
"creationTime": "2018-03-05T14:26:26.3215001Z",
"modelStatus": "Completed",
"modelStatusMessage": "Model Training Completed Successfully",
"parameters": {
"blobContainerName": "data-client1",
"catalogFileRelativePath": "3years/catalogue.csv",
"usageRelativePath": "3years/usage/3",
"supportThreshold": 4,
"cooccurrenceUnit": "User",
"similarityFunction": "Jaccard",
"enableColdItemPlacement": true,
"enableColdToColdRecommendations": true,
"enableUserAffinity": true,
"enableUserToItemRecommendations": true,
"allowSeedItemsInRecommendations": true,
"enableBackfilling": true,
"decayPeriodInDays": 60
},
"statistics": {
"totalDuration": "01:07:01.0929075",
"trainingDuration": "00:57:34.3260296",
"storingUserHistoryDuration": "00:57:13.0521714",
"catalogParsing": {
"duration": "00:00:13.3761182",
"successfulLinesCount": 104175,
"totalLinesCount": 104175
},
"usageEventsParsing": {
"duration": "00:09:13.3899645",
"errors": [
{
"count": 36,
"error": "UnknownItemId",
"sample": {
"file": "3years/usage/usage_part00.csv",
"line": 142809
}
}
],
"successfulLinesCount": 10499964,
"totalLinesCount": 10500000
},
"numberOfCatalogItems": 104175,
"numberOfUsageItems": 72983,
"numberOfUsers": 2967496,
"catalogCoverage": 0.70058075353971683,
"catalogFeatureWeights": {
"1": -0.0009318547,
"2": 0.01265376,
"3": 0.04137097,
"marca": 0.003024425
}
}
}
Since then I've been trying to train a model using the same parameters but with different data. This data belongs to the same client, and both datasets are in fact samples of the same larger data pool. The thing is that I can't seem to replicate the number of usage lines that worked in the previous example.
What's more, today I tried to replicate the model that had successfully trained, but it failed with the same error. This is the output when getting information about the model via its REST API:
{
"id": "yyyyyyyyy",
"description": "Client 1 recommendations 11 MM",
"creationTime": "2018-03-12T12:13:12.571664Z",
"modelStatus": "Failed",
"modelStatusMessage": "Core Training",
"parameters": {
"blobContainerName": "data-client1",
"catalogFileRelativePath": "3years/catalogue.csv",
"usageRelativePath": "3years/usage/1",
"supportThreshold": 4,
"cooccurrenceUnit": "User",
"similarityFunction": "Jaccard",
"enableColdItemPlacement": true,
"enableColdToColdRecommendations": true,
"enableUserAffinity": true,
"enableUserToItemRecommendations": true,
"allowSeedItemsInRecommendations": true,
"enableBackfilling": true,
"decayPeriodInDays": 60
}
}
What could be the problem here? It is the exact same configuration and data, but it fails!
Any help would be really appreciated.
Hi,
I am trying to train a model using REST API calls, but it is taking a lot of time, even though the usage file is only 170 MB.
The request looks like this:
headers
content-type : application/json
x-api-key:*******
Body
{
"description": "Test_RESTAPIBuild",
"blobContainerName": "input",
"catalogFileRelativePath": "Catalog/catalog_sample.csv",
"usageRelativePath": "Usage/usage_sample.csv",
"evaluationUsageRelativePath": "Usage",
"supportThreshold": 5,
"cooccurrenceUnit": "0",
"similarityFunction": "Jaccard",
"enableColdItemPlacement": true,
"enableColdToColdRecommendations": true,
"enableUserAffinity": true,
"enableUserToItemRecommendations": true,
"allowSeedItemsInRecommendations": true,
"enableBackfilling": true,
"decayPeriodInDays": 5
}
What exactly is the issue? I remember that when I trained through the C# application, it completed in seconds.
The response I am getting is:
{
"id": "XXXXXXX-d000-452a-947e-XXXXXXXXXXX",
"description": "Test_RESTAPIBuild",
"creationTime": "2018-01-10T10:31:36.0281983Z",
"modelStatus": "InProgress"
}
Hi,
I'm not sure if I missed something, but I think the text below from the /doc/sar.md page needs some corrections/revisions:
################## SAR DOC
Note that the recommendation score of an item is purely based on its similarity to Item 5 in this case (?? this seems to duplicate some text below). Assuming that the same item is not recommended again, items 1 and 4 have the highest score (did you mean items 4 and 5? The user has already seen Item 1) and would be recommended before items 2 and 3 (??).
Now, if this user adds Item 2 (was this meant to be Item 5?) to the shopping cart, the affinity vector (assuming weight 2 (weight 1?) for this transaction) will be
New User aff
Item 1 0
Item 2 0
Item 3 0
Item 4 0
Item 5 1
resulting in recommendation scores:
New User rec
Item 1 2
Item 2 1
Item 3 1
Item 4 2
Item 5 3
Note that the recommendation score of an item is purely based on its similarity to Item 5 in this case. Assuming that a same item is not recommended again, items 1 and 4 have the highest score and would be recommended before items 2 and 3. Now, if this user adds Item 2 to the shopping cart, affinity vector (assuming weight 2 for this transaction) will be
New User aff
Item 1 0
Item 2 2
Item 3 0
Item 4 0
Item 5 1
resulting in recommendation scores:
################################
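For what it's worth, the scoring step in the quoted passage is just a matrix-vector product. A sketch that reproduces the quoted numbers (only the Item 5 column of the similarity matrix is pinned down by the quoted scores; the rest is a placeholder assumption):

```python
# Sketch of the SAR scoring step: scores = similarity matrix @ affinity.
# Only the Item 5 column is fixed by the quoted scores [2, 1, 1, 2, 3];
# the other entries are placeholders, not values from the doc.
similarity = [
    [0, 0, 0, 0, 2],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 0, 3],
]
affinity = [0, 0, 0, 0, 1]  # the new user has interacted with Item 5 only

scores = [sum(s * a for s, a in zip(row, affinity)) for row in similarity]
print(scores)  # [2, 1, 1, 2, 3], matching the quoted recommendation scores
```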
Thanks,
For the evaluation, the recommendation is user-based, right? Is the evaluation file evaluated line by line, or are the items first grouped by timestamp/user ID and then matched against the recommendation results?
I have a dataset with fewer than 100 items and over 1M customers. I'm wondering whether I can access all of the output for the evaluation file?
Thank you so much!
I am trying to retrain an existing model based on a new usage file, but there is no API call to train an existing model. Can you please suggest anything?
Training over large data-sets takes quite a while, even on the highest deployment plan. Is there a way to incrementally train the SAR model as new usage data becomes available?
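The service only seems to expose full retraining, but co-occurrence counts are additive, so an incremental update is possible in principle. A sketch of the idea (not something the API offers, as far as I can tell):

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(baskets):
    """Count item pairs seen together in each basket (e.g. per user)."""
    counts = Counter()
    for basket in baskets:
        for i, j in combinations(sorted(basket), 2):
            counts[(i, j)] += 1
    return counts

# Counts are additive, so new usage can be folded into existing counts
# without re-reading all historical data.
old = cooccurrence_counts([{"A", "B"}, {"A", "C"}])
new = cooccurrence_counts([{"A", "B", "C"}])
merged = old + new
print(merged[("A", "B")])  # 2
```

The similarity step (Jaccard etc.) would still need recomputing from the merged counts, but that is cheap compared to re-parsing all historical usage files.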
Hi,
I've got the system working with up to 7 GB of training data. But as soon as I try 10 GB, the program falls over after approximately 4 hours with the following.
Any ideas?
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.IO.IOException
at System.IO.MemoryStream.Write(Byte[], Int32, Int32)
at Microsoft.MachineLearning.Data.IO.BinarySaver.WriteWorker(System.IO.Stream, System.Collections.Concurrent.BlockingCollection`1, ColumnCodec[], Microsoft.MachineLearning.Data.ISchema, Int32, Microsoft.MachineLearning.IChannelProvider, Microsoft.MachineLearning.Internal.Utilities.ExceptionMarshaller)
Exception Info: System.InvalidOperationException
at Microsoft.MachineLearning.Internal.Utilities.ExceptionMarshaller.ThrowIfSet(Microsoft.MachineLearning.IExceptionContext)
at Microsoft.MachineLearning.Data.IO.BinarySaver.SaveData(System.IO.Stream, Microsoft.MachineLearning.Data.IDataView, Int32[])
at Microsoft.MachineLearning.Recommend.ItemSimilarity.SimilarityMatrix.Save(Microsoft.MachineLearning.Model.ModelSaveContext)
at Microsoft.MachineLearning.Model.ModelSaveContext.SaveModel[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]](Microsoft.MachineLearning.Model.RepositoryWriter, System.__Canon, System.String)
at Microsoft.MachineLearning.Recommend.Sar.SarPredictor.Save(Microsoft.MachineLearning.Model.ModelSaveContext)
at Microsoft.MachineLearning.Model.ModelSaveContext.SaveModel[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]](Microsoft.MachineLearning.Model.RepositoryWriter, System.__Canon, System.String)
at Microsoft.MachineLearning.Data.TrainUtils.SaveModel(Microsoft.MachineLearning.IHostEnvironment, Microsoft.MachineLearning.IChannel, System.IO.Stream, Microsoft.MachineLearning.IPredictor, Microsoft.MachineLearning.Data.RoleMappedData, System.String)
at Microsoft.MachineLearning.EntryPoints.PredictorModel.Save(Microsoft.MachineLearning.IHostEnvironment, System.IO.Stream)
at Recommendations.Core.Train.TrainedModel.GetObjectData(System.Runtime.Serialization.SerializationInfo, System.Runtime.Serialization.StreamingContext)
at System.Runtime.Serialization.Formatters.Binary.WriteObjectInfo.InitSerialize(System.Object, System.Runtime.Serialization.ISurrogateSelector, System.Runtime.Serialization.StreamingContext, System.Runtime.Serialization.Formatters.Binary.SerObjectInfoInit, System.Runtime.Serialization.IFormatterConverter, System.Runtime.Serialization.Formatters.Binary.ObjectWriter, System.Runtime.Serialization.SerializationBinder)
at System.Runtime.Serialization.Formatters.Binary.WriteObjectInfo.Serialize(System.Object, System.Runtime.Serialization.ISurrogateSelector, System.Runtime.Serialization.StreamingContext, System.Runtime.Serialization.Formatters.Binary.SerObjectInfoInit, System.Runtime.Serialization.IFormatterConverter, System.Runtime.Serialization.Formatters.Binary.ObjectWriter, System.Runtime.Serialization.SerializationBinder)
at System.Runtime.Serialization.Formatters.Binary.ObjectWriter.Serialize(System.Object, System.Runtime.Remoting.Messaging.Header[], System.Runtime.Serialization.Formatters.Binary.__BinaryWriter, Boolean)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Serialize(System.IO.Stream, System.Object, System.Runtime.Remoting.Messaging.Header[], Boolean)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Serialize(System.IO.Stream, System.Object)
at Recommendations.Common.ModelsProvider.SerializeTrainedModel(Recommendations.Core.ITrainedModel, System.IO.Stream, System.Guid)
Exception Info: System.Exception
at Recommendations.Common.ModelsProvider.SerializeTrainedModel(Recommendations.Core.ITrainedModel, System.IO.Stream, System.Guid)
at Recommendations.Common.ModelsProvider+d__1.MoveNext()
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
at System.Runtime.CompilerServices.TaskAwaiter`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].GetResult()
at NativeTrainer.WebJobLogic+<TrainModelAsync>d__3.MoveNext()
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
at System.Runtime.CompilerServices.TaskAwaiter`1[[System.Boolean, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].GetResult()
at NativeTrainer.WebJobLogic+d__1.MoveNext()
Exception Info: System.AggregateException
at NativeTrainer.Program.Main(System.String[])
TLCRecommendations.dll in the Microsoft.MachineLearning.TLCRecommendations 3.8.51.1475 NuGet package consumed by this template project is delay-signed. It won't load unless we skip strong-name validation for the Microsoft public key used for the partial signing.
Can I change the user ID from int to string? We have a system keeping track of users today, but its identifier is of type string.
Same with products: our product identifier is a GUID. Is it possible to change the int product ID to a string, and how would I go about doing that?
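If changing the internal types turns out to be hard, a translation layer on the caller's side may be enough. A sketch (all names here are mine, not from the repo):

```python
# Sketch of a caller-side ID translator (names are mine, not the repo's):
# map arbitrary external IDs (GUID strings, etc.) to the integer IDs the
# solution expects, and back again when reading recommendations.
class IdTranslator:
    def __init__(self):
        self._to_int = {}
        self._to_ext = []

    def encode(self, external_id):
        if external_id not in self._to_int:
            self._to_int[external_id] = len(self._to_ext)
            self._to_ext.append(external_id)
        return self._to_int[external_id]

    def decode(self, internal_id):
        return self._to_ext[internal_id]

ids = IdTranslator()
n = ids.encode("b68bd680-5cc2-4220-98b6-be743943006c")
print(n, ids.decode(n))  # 0, then the original GUID back
```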
I already have existing resource groups and an existing Azure App Service.
I want to deploy this solution and connect it to existing Azure services (existing resource group, existing App Service, existing storage account). But the deployment creates new blank resources, and I am not able to repoint the Web App to my existing App Service later (same region etc.) because of a new limitation.
Please make it easier to deploy the pieces separately and to connect to existing Azure resources, rather than creating new ones.
Hi,
Are there any ways to filter and boost recommendations according to particular properties of items, such as view count, genres with non-linear boundaries, etc.? Also, I would like to know the future roadmap of this project as envisioned internally at Microsoft.
Thank you.
I have supplied a usage file for evaluation which contains 500,000 lines. In the diversity evaluation scores, I see that only a total of 160,000 items were recommended (600 unique items). Why is the total number smaller than the number of lines in the evaluation file? I thought several items would be recommended for each line in the usage file? Or is this per customer?
I am creating the model using a dataset of 9K transaction records, with almost 8K unique users and a total of 1,100 products.
I was getting a very low score for recommended items, which is understandable given the very low co-occurrence.
See below:
Then, when I change the similarity function from "Jaccard" to "Cooccurrence", my score jumps right up.
See below:
Does anyone know why?
Can I deploy this solution on an on-premises server rather than on the Azure cloud?
How can I do that?
Thanks
Is there a chance this solution will ever be rewritten to .NET Core?
This would open up new options for hosting it in Azure.
Alternatively - perhaps there is some new similar project in .NET Core worth considering?
Hi All!
I am trying to find a solution for the following:
I have transactional data connected to users (via loyalty cards) from supermarkets: roughly 30,000 items and 100,000 users with over a million transactions.
Now, training the model and getting recommendations is not a problem (the solution is great for that), but I cannot get my head around how to get individual scores for user and item combinations.
Basically, I have a weekly catalogue of around 50 items on sale. I would like to display these "coupons" in descending order in our mobile application, where the user is signed in, based on the user's recommendation score for each item.
Is there a way to input a user ID and product ID and get the score?
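Client-side, this could reduce to sorting the weekly coupon list by the scores returned from a user-to-item recommendation call. A sketch of what I mean (the payload field names are assumptions, not verified against the API):

```python
# Sketch: order this week's coupon items by the user's recommendation
# scores (payload shape assumed, not taken from the API reference).
def rank_coupons(user_recommendations, coupon_item_ids, default_score=0.0):
    scores = {r["recommendedItemId"]: r["score"] for r in user_recommendations}
    return sorted(coupon_item_ids,
                  key=lambda item: scores.get(item, default_score),
                  reverse=True)

recs = [{"recommendedItemId": "milk", "score": 0.9},
        {"recommendedItemId": "bread", "score": 0.4}]
print(rank_coupons(recs, ["bread", "milk", "eggs"]))
# ['milk', 'bread', 'eggs']
```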
Thanks in advance!
Mark
There are important files that all Microsoft projects should have that are not present in this repository. A pull request has been opened to add the missing file(s). When the PR is merged, this issue will be closed automatically.
Microsoft teams can learn more about this effort and share feedback within the open source guidance available internally.
Is there any way to ask for recommendations for a first-time user who has expressed interest in a specific category?
We recently tried to deploy this solution with a partner, using the sample C# console app client.
We faced several issues massaging the partner's data into the correct format for the SAR service, and found the error messages from the service not very helpful for tracking down data issues. For example, the input data looks like CSV, but the service doesn't seem to support standard CSV encoding of commas within columns.
As such, it would be super valuable to have a small tool that verifies the input data is in the correct format for the SAR service, so that data issues can be precisely located and fixed. Today you have to run the model and wait until the bad data is hit, which greatly slows down iteration.
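The checks could be as simple as this sketch (the line format and rules here are our reading of the docs, not an official validator):

```python
import re
from datetime import datetime

# Sketch of the validation tool we wished for (format and rules are our
# guesses from the docs, not an official checker): verify usage file lines
# before uploading, so bad rows are located up front.
ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")

def check_usage_line(line, line_no):
    """Return a list of problems for one <userId>,<itemId>,<timestamp> row."""
    problems = []
    fields = line.rstrip("\n").split(",")
    if len(fields) < 3:
        return [f"line {line_no}: expected at least 3 fields, got {len(fields)}"]
    user_id, item_id, timestamp = fields[:3]
    if not ID_RE.match(item_id):
        problems.append(f"line {line_no}: illegal characters in item id {item_id!r}")
    try:
        datetime.strptime(timestamp, "%Y/%m/%dT%H:%M:%S")
    except ValueError:
        problems.append(f"line {line_no}: unparsable timestamp {timestamp!r}")
    return problems

print(check_usage_line("u1,item_42,2018/03/05T14:26:26", 1))  # []
print(check_usage_line("u1,bad id,2018/03/05T14:26:26", 2))   # one problem
```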
There appears to be zero traffic on this repository, nobody posting bugs, nobody asking or answering questions.
Does anyone in the real world use this product?
Or has Microsoft abandoned this field to AWS and Google?
How do you define the co-occurrence of items? Are items with the same timestamp in the usage file treated as the same transaction, or items within a certain time range?
Hi,
Is there a way to filter the recommendedItemId field by some constraint?
For example:
only return recommendedItemId results ending with the character "w"
&recommendedItemId=.*w$
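The API does not appear to support this, but filtering client-side is straightforward. A sketch (the field name is taken from the example above; the payload shape is an assumption):

```python
import re

# Sketch: client-side filter of recommendation results by a regex on
# recommendedItemId (the API itself does not appear to support this).
def filter_recommendations(results, pattern):
    rx = re.compile(pattern)
    return [r for r in results if rx.search(r["recommendedItemId"])]

results = [{"recommendedItemId": "item-w", "score": 0.8},
           {"recommendedItemId": "item-x", "score": 0.6}]
print(filter_recommendations(results, r"w$"))
# only 'item-w' survives
```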
Hi,
Let me know if you have a NodeJS implementation of this sample program. I would also like to know the availability of any libraries/documentation in NodeJS for the Product Recommendations service.
I tried to use the catalog.xlsx file and the three CSVs with timestamps in one folder, but got errors.
Catalog
Usage
Postman request
Postman response
{
"id": "922792be-1add-485f-af51-07513ad8cb88",
"description": "Simple recommendations model",
"creationTime": "2020-07-04T21:43:03.9316674Z",
"modelStatus": "Failed",
"modelStatusMessage": "Failed to parse catalog file or parsing found no valid items",
"parameters": {
"blobContainerName": "trainingdata",
"catalogFileRelativePath": "catalogs/catalog.xlsx",
"usageRelativePath": "usage2",
"supportThreshold": 6,
"cooccurrenceUnit": "User",
"similarityFunction": "Jaccard",
"enableColdItemPlacement": true,
"enableColdToColdRecommendations": false,
"enableUserAffinity": true,
"enableUserToItemRecommendations": false,
"allowSeedItemsInRecommendations": true,
"enableBackfilling": true,
"decayPeriodInDays": 30
},
"statistics": {
"totalDuration": "00:00:00",
"trainingDuration": "00:00:00",
"catalogParsing": {
"duration": "00:00:00.0295404",
"errors": [
{
"count": 65,
"error": "MissingFields",
"sample": {
"file": "catalogs/catalog.xlsx",
"line": 1
}
},
{
"count": 34,
"error": "IllegalCharactersInItemId",
"sample": {
"file": "catalogs/catalog.xlsx",
"line": 5
}
},
{
"count": 1,
"error": "ItemIdTooLong",
"sample": {
"file": "catalogs/catalog.xlsx",
"line": 14
}
},
{
"count": 1,
"error": "MalformedLine",
"sample": {
"file": "catalogs/catalog.xlsx",
"line": 53
}
}
],
"successfulLinesCount": 0,
"totalLinesCount": 101
},
"numberOfCatalogItems": 0,
"numberOfUsageItems": 0,
"numberOfUsers": 0
}
}
Then I tried the other sample data.
Postman request
This trained successfully, but...
When asking for recommendations, the numbers I get seem a bit too high; something seems wrong here. What are the latest tested files, and what setup is used for testing?
/api/models/fa2dd4a3-c783-4c0f-8a45-d801a2ee746b/recommend?itemId=2005018
Hi, I read ModelTrainingParameters.cs for the parameter setup and definitions.
From my limited study, I tend to think, or suggest, that the co-occurrence unit really refers to a level:
User level - measures co-occurrence of items for the same user (regardless of time/date)
Timestamp level - on top of the same user, it further requires the same date/time (for that user)
If my understanding of the /// comments is correct, then the wording or term "level" would be more appropriate, since "co-occurrence unit" on its own may be read as:
User - items bought by the same user (regardless of time/date)
Timestamp - items bought within the same timestamp (mostly by the same user, but if there are multiple checkout machines / POS terminals, they may come from different users)
So there is a subtle, potential difference, and this also puzzles beginners to this recommender system, especially since the use of the term here deviates slightly from the literature elsewhere (co-occurrence more often refers to a product/item matrix or dimension*).
*It may be the case that internally the software establishes such a matrix (a product co-occurrence matrix per user or per timestamp), but the term as presented causes confusion for serious learners.
Hence the wording "co-occurrence level" or "focus" is suggested (especially since the word "unit" may also suggest a quantity).
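To make the distinction concrete, a sketch of the two counting units as I read the /// comments (my reconstruction, not the repo's code):

```python
from collections import Counter
from itertools import combinations

# Sketch of the two co-occurrence units as I read them (my reconstruction):
# 'User' pools all of a user's events; 'Timestamp' splits them further by
# (user, timestamp), so only same-moment items co-occur.
events = [("u1", "09:00", "A"), ("u1", "09:00", "B"), ("u1", "17:00", "C")]

def cooccur(events, unit):
    baskets = {}
    for user, ts, item in events:
        key = user if unit == "User" else (user, ts)
        baskets.setdefault(key, set()).add(item)
    counts = Counter()
    for basket in baskets.values():
        for i, j in combinations(sorted(basket), 2):
            counts[(i, j)] += 1
    return counts

print(cooccur(events, "User"))       # A-B, A-C, B-C all co-occur
print(cooccur(events, "Timestamp"))  # only A-B co-occur
```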
We implement product recommendations solutions based on this project on many websites for our customers.
We've encountered websites which use different formats for their item IDs. Some use simple numbers, but strings like the following are also valid item IDs in some systems:
J4Z18-JSOM100-27M+31S+36S
[S4Z17-TSDLF701] TSDLF701 - kobalt
g.2017.12.x.xmas-gift-1
head&shoulders_2014_M7_G1
Charmine Rose_2015_M10_G2
It's not up to us to decide whether these formats are good or not; it's just how the real world looks.
Unfortunately, it is impossible to use these strings as item identifiers in Product-Recommendations, because it imposes character restrictions on the item ID in the catalog file. The allowed characters are letters, numbers, dashes, and underscores. There is a piece of code in CatalogFileParser.cs that performs the check:
// check for illegal characters in the item id
if (!itemId.All(UsageEventsFilesParser.IsAlphanumericDashOrUnderscoreCharacter))
{
parsingError = ParsingErrorReason.IllegalCharactersInItemId;
return null;
}
The same check is performed when parsing usage events file.
My question is: is it really necessary to be so restrictive about the product ID? Why not allow any string? We've seen all kinds of special characters, spaces, and even national characters in IDs.
Right now we would need to create some kind of ID translator and meticulously convert IDs back and forth every time we feed data to or retrieve data from Product-Recommendations.
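The translator we would have to build looks roughly like this (a sketch; the allowed-character rule mirrors the check in CatalogFileParser.cs, everything else is our own design):

```python
import re

# Sketch of the ID translator we'd need: map arbitrary external item IDs
# to IDs containing only letters, digits, dash and underscore (the rule
# enforced in CatalogFileParser.cs), and keep a table to map them back.
_FORBIDDEN = re.compile(r"[^A-Za-z0-9_-]")

class ItemIdMapper:
    def __init__(self):
        self._forward = {}
        self._reverse = {}

    def sanitize(self, external_id):
        if external_id in self._forward:
            return self._forward[external_id]
        candidate = _FORBIDDEN.sub("_", external_id)
        # disambiguate collisions ("a.b" and "a,b" both become "a_b")
        safe, n = candidate, 1
        while safe in self._reverse:
            safe = f"{candidate}-{n}"
            n += 1
        self._forward[external_id] = safe
        self._reverse[safe] = external_id
        return safe

    def restore(self, safe_id):
        return self._reverse[safe_id]

m = ItemIdMapper()
print(m.sanitize("head&shoulders_2014_M7_G1"))  # head_shoulders_2014_M7_G1
print(m.restore("head_shoulders_2014_M7_G1"))   # the original ID back
```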
Hi,
I'm the system architect for a company that produces a literacy product for students reading in schools. The product is basically an e-reader in which students can select and read books in class; we've gamified the experience while collecting a ton of data points.
The problem we're trying to solve now is that students are spending too much time browsing through the library of available books; we want to minimize their browsing time and maximize their reading time.
So we figure Product Recommendations might be the way to go. A few questions, though:
TIA for your help!