Java: Product Recommendation with a Collaborative Filtering Algorithm

March 15, 2023. Source: Anonymity_Y

Steps for implementing the collaborative filtering algorithm

1. Build the user behavior matrix, that is, count how many items of each product type each user has purchased

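 public double[] getNumByCustomer(Customer customer){
        // Order items of this customer, filtered by the alive and state flags
        List<OrderItem> list =orderItemDao.findByCustomerAndAliveAndState(customer.getId(),1,2);
        // One dimension per product type (totalNum is assumed to equal the number of product types)
        double [] vector =new double[totalNum];
        int index=0;
        for(ProductType type:productTypes){
            for(OrderItem orderItem:list){
                // Objects.equals (java.util.Objects) also compares correctly when id is a boxed Long
                if(Objects.equals(orderItem.getProduct().getProductType().id,type.id)){
                    vector[index]=vector[index]+orderItem.getNum();
                }
            }
            index++; // move on to the dimension of the next product type
        }
        return vector;
    }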
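
To make the shape of one matrix row concrete, here is a minimal, self-contained sketch that does not depend on the DAO layer. The `Purchase` class and the `toVector` method are hypothetical names introduced only for this illustration; they are not part of the original project.

```java
import java.util.Arrays;
import java.util.List;

final class BehaviorVectorDemo {
    /** Hypothetical stand-in for an order item: a product type id and the quantity bought. */
    static final class Purchase {
        final long productTypeId;
        final int quantity;

        Purchase(long productTypeId, int quantity) {
            this.productTypeId = productTypeId;
            this.quantity = quantity;
        }
    }

    /**
     * Builds one row of the user behavior matrix: element i is the total
     * quantity the user bought of the product type typeIds.get(i).
     */
    static double[] toVector(List<Long> typeIds, List<Purchase> purchases) {
        double[] vector = new double[typeIds.size()];
        for (Purchase p : purchases) {
            int index = typeIds.indexOf(p.productTypeId); // -1 if the type is unknown
            if (index >= 0) {
                vector[index] += p.quantity;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<Long> typeIds = Arrays.asList(10L, 20L, 30L);
        List<Purchase> purchases = Arrays.asList(
                new Purchase(10L, 2), new Purchase(30L, 1), new Purchase(10L, 3));
        // Type 10 was bought 2 + 3 times, type 20 never, type 30 once.
        System.out.println(Arrays.toString(toVector(typeIds, purchases)));
        // prints [5.0, 0.0, 1.0]
    }
}
```

Stacking one such vector per user gives the behavior matrix used in the following steps.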

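2. Use cosine similarity to compute how similar each user's behavior is to every other user's
The code below computes the similarity between two users; iterating over all users yields every pairwise similarity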

 public double countSimilarity(double [] a,double [] b){
        double total=0;   // dot product of a and b
        double alength=0; // squared length of a
        double blength=0; // squared length of b
        for(int i=0;i<a.length;i++){
            total=total+a[i]*b[i];
            alength=alength+a[i]*a[i];
            blength=blength+b[i]*b[i];
        }
        double down=Math.sqrt(alength)*Math.sqrt(blength);
        double result=0;
        // If either vector is all zeros the denominator is 0; the similarity is then reported as 0
        if(down!=0){
            result =total/down;
        }
        return result;
    }
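
As a quick sanity check of the formula, the sketch below restates the same computation as a standalone static method and runs it on three small cases. The class name `CosineDemo` and the length check are additions made for this illustration only.

```java
final class CosineDemo {
    /** Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        return denominator == 0 ? 0 : dot / denominator;
    }

    public static void main(String[] args) {
        // Same direction (one user buys twice as much of everything): close to 1
        System.out.println(cosine(new double[] {1, 0, 2}, new double[] {2, 0, 4}));
        // No product type in common: 0
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
        // A user with no purchases: 0 instead of a division by zero
        System.out.println(cosine(new double[] {0, 0}, new double[] {3, 4}));
    }
}
```

Because purchase counts are never negative, the result always lies between 0 and 1 for this kind of data.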

3. Take the n users with the highest similarity to form the set of similar users
This is done by sorting the Map by value

 public List<Map.Entry<Long,Double>> getMaxSimilarity(Customer customer){
        Map<Long,Double> result =new HashMap<Long,Double>();
        double vector[] =(double [])users.get(customer.getId());
        for(Map.Entry<Long,Object> entry:users.entrySet()){
            // Skip the customer itself; equals() avoids comparing boxed Long references with !=
            if(!entry.getKey().equals(customer.getId())){
                double [] temp =(double[])entry.getValue();
                double similarity =countSimilarity(temp,vector);
                result.put(entry.getKey(),similarity);
            }
        }
        // Sort by similarity in descending order; callers take the first n entries
        List<Map.Entry<Long,Double>> list = new ArrayList<Map.Entry<Long,Double>>( result.entrySet() );
        Collections.sort( list, new Comparator<Map.Entry<Long,Double>>(){
            public int compare( Map.Entry<Long,Double> o1, Map.Entry<Long,Double> o2 )
            {
                return (o2.getValue()).compareTo( o1.getValue() );
            }
        } );
        return list;
    }
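
On Java 8 or later, the same "sort by value, keep the first n" step can be written with streams. This is only an alternative sketch; `TopNDemo` and `topN` are hypothetical names, and the input map is assumed to already hold the similarity of every other user.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TopNDemo {
    /** Returns the n entries with the highest similarity, most similar first. */
    static List<Map.Entry<Long, Double>> topN(Map<Long, Double> similarities, int n) {
        return similarities.entrySet().stream()
                .sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toList());
    }
}
```

For example, with similarities {1=0.2, 2=0.9, 3=0.5} and n = 2, the result contains user 2 followed by user 3.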

4. Collect the products purchased by the similar users, total the quantity purchased per product, and sort the products by that total

  public Map<Long,ProductNumModel> getProducts(List<Map.Entry<Long,Double>> list){
        List<Customer> simCustomers =new ArrayList<Customer>();
        System.out.println("The 3 most similar users:");
        // list is sorted by similarity in descending order, so the first 3 entries are the most similar users
        for(int i=0;i<list.size()&&i<3;i++){
            Long id =list.get(i).getKey();
            Customer customer =customerDao.findByIdAndAlive(id,1);
            if(customer!=null){ // skip users that can no longer be loaded
                simCustomers.add(customer);
            }
        }
        // Product id -> product plus the total quantity bought by the similar users
        Map<Long,ProductNumModel> map =new HashMap<Long,ProductNumModel>();
        for(Customer customer:simCustomers){
           Map<Long,ProductNumModel> purchased =getCustomerProduct(customer);
           for(Map.Entry<Long,ProductNumModel> entry:purchased.entrySet()){
                ProductNumModel model=null;
                if(map.containsKey(entry.getKey())){
                    // Product already seen: add this user's quantity to the running total
                    model=map.get(entry.getKey());
                    model.num+=entry.getValue().num;
                }else{
                    model=new ProductNumModel();
                    model.product=entry.getValue().product;
                    model.num=entry.getValue().num;
                }
                map.put(entry.getKey(),model);
            }
        }
        return map;
    }
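
The original post does not show the sorting helper (`sortProduct`), so the following is only a sketch of how the totals could be accumulated and ranked. It works on plain product ids and quantities; `ProductRankingDemo` and `rank` are hypothetical names, and breaking ties by product id is an assumption made to keep the output deterministic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ProductRankingDemo {
    /**
     * Each input map holds one similar user's purchases (product id -> quantity).
     * Returns the product ids ordered by total quantity, highest first.
     */
    static List<Long> rank(List<Map<Long, Integer>> purchasesPerSimilarUser) {
        Map<Long, Integer> totals = new HashMap<>();
        for (Map<Long, Integer> purchases : purchasesPerSimilarUser) {
            for (Map.Entry<Long, Integer> e : purchases.entrySet()) {
                // merge() inserts the quantity or adds it to the existing total
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return totals.entrySet().stream()
                .sorted(Map.Entry.<Long, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<Long, Integer>comparingByKey()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

If one similar user bought {1=2, 2=1} and another bought {2=3, 3=1}, the totals are {1=2, 2=4, 3=1} and the ranking is product 2, then 1, then 3.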

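The top-level function chains the previous functions together and stores the result in a file. If the file has no content yet, the recommendations are computed with the algorithm; if it already has content, that content is read directly. A scheduled task can delete the recommendation file every day or every week, so the recommendations are recomputed automatically on the next request.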

   public Map<String,Object> getAllSimilarity(Customer customer) throws IOException {
        changeCustomerToVector();
        // Look up the path of the recommendation file in cxtx.properties
        Properties p = new Properties();
        try (InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("cxtx.properties")) {
            p.load(inputStream);
        } catch (IOException e1) {
            e1.printStackTrace();
        }
        String folderPath = p.getProperty("recommendFile");
        File file=new File(folderPath);
        if(!file.exists()){
            file.createNewFile();
        }
        Map<String,Object> map =new HashMap<String,Object>();
        com.alibaba.fastjson.JSONObject jsonObject = null;
        // try-with-resources closes the stream once the file has been read
        try (FileInputStream fileInputStream=new FileInputStream(file)) {
            jsonObject = com.alibaba.fastjson.JSON.parseObject(IOUtils.toString(fileInputStream, "UTF-8"));
        } catch (IOException e) {
            map.put("msg","Invalid JSON format");
            map.put("content","");
            return map;
        }
        Object content=null;
        String key=String.valueOf(customer.getId());
        if(jsonObject==null){ // Nothing in the file yet: compute the recommended products for every user
            // String keys, so the lookup below matches the keys of the JSON object
            Map<String,Object> temp =new HashMap<String,Object>();
            for(Customer c:customers){
               List<Map.Entry<Long,Double>> list =this.getMaxSimilarity(c);
               Map<Long,ProductNumModel> result =getProducts(list);
               List<Product> list1=sortProduct(result);
               temp.put(String.valueOf(c.getId()),list1);
            }
            JSONObject object=new JSONObject(temp);
            // The writer is flushed and closed automatically
            try (BufferedWriter bufferedWriter=new BufferedWriter(new FileWriter(file))) {
                bufferedWriter.write(object.toString());
            }
            content= object.get(key);
        }else{
            content=jsonObject.get(key);
        }
        map.put("msg","Success");
        map.put("content",content);
        return map;
    }
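
Stripped of the JSON and DAO details, the caching idea in this function is "read the file if it has content, otherwise compute and write it". The sketch below shows only that pattern with `java.nio.file`; `FileCacheDemo`, `loadOrCompute` and the `Supplier` parameter are hypothetical names for this illustration, and the cached value is treated as an opaque string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

final class FileCacheDemo {
    /**
     * Returns the cached text if the file exists and is not empty;
     * otherwise runs the computation, stores its result in the file and returns it.
     */
    static String loadOrCompute(Path cacheFile, Supplier<String> compute) throws IOException {
        if (Files.exists(cacheFile) && Files.size(cacheFile) > 0) {
            return new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8);
        }
        String fresh = compute.get();
        // Creates the file if needed and replaces any previous content
        Files.write(cacheFile, fresh.getBytes(StandardCharsets.UTF_8));
        return fresh;
    }
}
```

Deleting the file, for instance from a daily or weekly scheduled job, makes the next call fall through to the computation again, which is how the recommendations get refreshed.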

Points to note:

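1. When computing user similarity, handle the case where the denominator is 0. Also guard against values that grow too large for a double to represent by preprocessing the data, for example by dividing each vector component by the largest sales volume of any product, or by subtracting some constant from it.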

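2. The closer the cosine value is to 1, the more similar the two vectors are; in other words, the larger the computed value, the more similar the users' behavior.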

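3. The final list of recommended products may be long or short, so rank it by some strategy, for example by the quantity purchased by the similar users rather than by each product's total sales volume, because data from dissimilar users easily introduces noise.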
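
One possible reading of the scaling suggestion in note 1 is to divide each dimension by the largest value that any user has in that dimension, so that every component ends up between 0 and 1. The sketch below assumes non-negative counts and a rectangular matrix with one row per user; `ScalingDemo` and `scaleByColumnMax` are hypothetical names, and the original post does not say which scaling the author actually used.

```java
final class ScalingDemo {
    /** Divides every column by its maximum across all users; all-zero columns stay 0. */
    static double[][] scaleByColumnMax(double[][] matrix) {
        if (matrix.length == 0) {
            return new double[0][];
        }
        int cols = matrix[0].length;
        double[] max = new double[cols];
        for (double[] row : matrix) {
            for (int j = 0; j < cols; j++) {
                max[j] = Math.max(max[j], row[j]);
            }
        }
        double[][] scaled = new double[matrix.length][cols];
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < cols; j++) {
                scaled[i][j] = max[j] == 0 ? 0 : matrix[i][j] / max[j];
            }
        }
        return scaled;
    }
}
```

For the two users {2, 100} and {4, 50} this yields {0.5, 1.0} and {1.0, 0.5}. Note that scaling per dimension changes the cosine values, whereas multiplying a whole vector by one constant would not.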

    Original author: Anonymity_Y
    Original URL: https://blog.csdn.net/u011376686/article/details/54287594