Tumbe Group of International Journals



AGRICULTURAL CROP YIELD PREDICTION AND EFFICIENT USE OF FERTILIZER USING MACHINE LEARNING

Pratik Tawde

Lecturer

Department of Electronics and Telecommunication

Vidyalankar Polytechnic, Wadala, Mumbai, INDIA

pratik.tawde@vpt.edu.in

Abstract:

India is an agricultural country whose economy depends predominantly on growth in agricultural yield and on allied agro-industry products. Indian agriculture relies heavily on rainfall, which is highly unpredictable. Agricultural growth also depends on diverse soil parameters, namely nitrogen, phosphorus, potassium, crop rotation, soil moisture and surface temperature, and on weather aspects such as temperature and rainfall. India is progressing rapidly in technical development, and technology can benefit agriculture by increasing crop productivity and giving farmers better yields. The proposed project provides a solution for smart agriculture by monitoring the agricultural field, which can help farmers increase productivity considerably. Weather forecast data such as temperature and rainfall, obtained from the IMD (India Meteorological Department), together with a repository of soil parameters, give insight into which crops are suitable for cultivation in a particular area. This work presents a system, in the form of an Android-based application, that uses data analytics techniques to predict the most profitable crop under the current weather and soil conditions. The proposed system integrates the data obtained from the soil repository and the weather department and applies a machine learning algorithm, Multiple Linear Regression, to predict the most suitable crops for the current environmental conditions. This gives the farmer a variety of crop options that can be cultivated. The project thus develops a system that combines data from various sources, data analytics and predictive analysis to improve crop yield and increase farmers' profit margins over the long run.

Keywords: Data Analytics, Prediction, Machine Learning, Multiple Linear Regression, Android Application.

I. Introduction

India is a highly populated country, and random changes in climatic conditions make it necessary to secure food resources. Farmers face serious problems in drought conditions. Soil type plays a major role in crop yield, and recommending appropriate fertilizer use can help farmers make the best decision for their cropping situation. A number of studies show that Information and Communication Technology (ICT) can be applied to crop yield prediction, and data mining can also be used to predict crop yield. By fully analysing previous data, we can suggest to the farmer a better crop for a better yield. For a better yield we need to consider soil type and soil fertility, as well as rainfall and groundwater availability, which are among the major factors: on dry land it is better to grow cash crops, and on wetland it is better to grow wheat and sugarcane. India has 15 agro-climatic regions, divided on the basis of land type, and each agro-climatic region can grow certain specific crops. From the crops that belong to a region, we need to suggest to the farmer which one is best. Achieving the maximum yield at minimum cost is the ultimate aim of the project. Early detection and management of problems can help farmers obtain a better crop yield. Crop yield prediction is an important research area that helps to secure food. To understand crop yield better, we need to study large volumes of data with the help of machine learning algorithms, which can estimate the yield for a given crop accurately and suggest a better crop to the farmer. Precision agriculture means obtaining a better understanding of the crop using information technology methods; improving crop quantity is one of its key goals, and its main goals are profitability and sustainability. Since ancient times agriculture has been the backbone of our country.
Nowadays climatic conditions vary very often, so it is hard to grow crops by judging weather conditions alone. We need technology to understand crop details and guide farmers to grow crops accordingly. Fertilizer is also one of the major factors in growing crops: if too much or too little fertilizer is applied to the field, the soil may lose its fertility and the crop may not give the expected yield. Understanding temperature conditions is especially necessary for India, because crop prediction plays a major role in the Indian economy and can help improve it. In general, machine learning algorithms can predict the yield efficiently. Previously, yield was predicted on the basis of the farmer's prior experience, but weather conditions may now change drastically, so farmers cannot guess the yield. Technology can help them predict the yield of a crop and decide whether to grow that crop or not. A machine learning model learns the pattern between crop and yield under several conditions and predicts the yield for the area in which the farmer is going to grow the crop. The challenge is to build an efficient model to predict the crop output: different algorithms are tried and compared, and the model with the lowest error and loss is chosen to predict the yield of that particular crop.

II. Motivation

Agriculture is the most important sector influencing the economy of India. It contributes 18% of India's Gross Domestic Product (GDP) and gives employment to 50% of the population of India. The people of India have practiced agriculture for years, but the results have never been satisfying because of the various factors that affect crop yield. To fulfil the needs of around 1.2 billion people, it is very important to have a good crop yield. Crop yield is directly influenced by factors such as soil type, precipitation, seed quality and lack of technical facilities. Hence, new technologies are necessary to satisfy the growing need, and farmers must work smartly by adopting new technologies rather than relying on conventional methods.

III. Problem Statement

Early prediction of crop yield is important for planning and for taking various policy decisions. Many countries use conventional data collection techniques for crop monitoring and yield prediction, based on ground visits and reports. These methods are subjective, very costly and time consuming. The common problems in existing crop yield prediction methods are as follows. The most important problems are limited accuracy and long processing time. Existing time-series methods do not react to variations caused by cycles and seasonal effects. Existing simulation models need extensive information for development and testing, while the information available in agriculture is sparse and incomplete. Only limited studies have applied decision tree techniques to crop yield prediction. Prediction error is also an important problem in crop yield prediction and estimation methods. These drawbacks of existing work motivate this research on crop yield prediction.

III. System Design

Figure 1: Proposed System Design

 

IV. Proposed System Methodology

Linear Regression:

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable to be predicted is called the dependent variable, and the variable used to predict it is called the independent variable. This form of analysis estimates the coefficients of a linear equation, involving one or more independent variables, that best predict the value of the dependent variable. Linear regression fits a straight line or surface that minimizes the discrepancies between predicted and actual output values. Simple linear regression calculators use a “least squares” method to find the best-fit line for a set of paired data. The value of Y (dependent variable) is then estimated from X (independent variable).
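The least-squares fit of a single best-fit line can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name and the rainfall and yield figures are hypothetical and were invented so that the result is easy to check by hand.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical paired data (independent variable: rainfall in mm,
# dependent variable: yield in t/ha); the numbers are illustrative only.
rainfall = [100.0, 200.0, 300.0, 400.0]
crop_yield = [1.5, 2.0, 2.5, 3.0]

intercept, slope = fit_simple_linear_regression(rainfall, crop_yield)
predicted = intercept + slope * 250.0  # estimate the dependent variable at x = 250
```

For this invented data set the points lie exactly on a line, so the fitted line passes through all of them and the residuals are zero.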

Assumptions to be considered for success with linear-regression analysis:

For each variable: Consider the number of valid cases, mean and standard deviation.

For each model: Consider regression coefficients, correlation matrix, part and partial correlations, multiple R, R², adjusted R², change in R², standard error of the estimate, analysis-of-variance table, predicted values and residuals. Also, consider 95-percent confidence intervals for each regression coefficient, variance-covariance matrix, variance inflation factor, tolerance, Durbin-Watson test, distance measures (Mahalanobis, Cook and leverage values), DfBeta, DfFit, prediction intervals and case-wise diagnostic information.

Plots: Consider scatterplots, partial plots, histograms and normal probability plots.

 

Data: Dependent and independent variables should be quantitative. Categorical variables, such as religion, major field of study or region of residence, need to be recoded to binary (dummy) variables or other types of contrast variables. 

Other assumptions: For each value of the independent variable, the distribution of the dependent variable must be normal. The variance of the distribution of the dependent variable should be constant for all values of the independent variable. The relationship between the dependent variable and each independent variable should be linear and all observations should be independent.
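The recoding of a categorical variable into binary (dummy) variables, mentioned under "Data" above, can be sketched as follows. The function name and the soil-type values are hypothetical; dropping the first category as the reference level is a common convention assumed here, not something the paper specifies.

```python
def dummy_encode(values, drop_first=True):
    """Recode a categorical column into binary (dummy) columns.

    Returns (column_names, rows). With drop_first=True the first category
    in sorted order becomes the reference level and gets no column, which
    avoids perfect collinearity with the regression intercept.
    """
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    rows = [[1 if value == category else 0 for category in kept] for value in values]
    return kept, rows


# Hypothetical categorical predictor: soil type recorded for four fields.
soil_types = ["clay", "loam", "sandy", "loam"]
names, encoded = dummy_encode(soil_types)
```

Here "clay" is the reference level, so a field with clay soil is encoded as all zeros and the other two soil types each get one 0/1 column.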

Non Linear Regression: Nonlinear regression is a form of regression analysis in which observational data are modeled by a function that is a nonlinear combination of the model parameters and depends on one or more independent variables. The data are fitted by a method of successive approximations.

Sometimes linear models are not sufficient to capture real-world phenomena, and nonlinear models are necessary. In regression, all such models have the same basic form, i.e., y = f(x).

In linear regression, we have f(x) = Wx + b; the parameters W and b must be fit to data. What nonlinear function do we choose? In principle, f(x) could be anything: it could involve linear functions, sines and cosines, summations, and so on. However, the form we choose makes a big difference to the effectiveness of the regression: a more general model requires more data to fit, and different models are appropriate for different problems. Ideally, the form of the model would be matched exactly to the underlying phenomenon. If we are modeling a linear process, we use a linear regression; if we are modeling a physical process, we could, in principle, model f(x) by the equations of physics. In many situations, we do not know much about the underlying nature of the process being modeled, or modeling it precisely is too difficult. In these cases, we typically turn to a few machine learning models that are widely used and quite effective for many problems. These methods include basis function regression (including Radial Basis Functions), Artificial Neural Networks, and k-Nearest Neighbors. There is one other important choice to be made, namely the choice of objective function for learning or, equivalently, the underlying noise model. The least-squares (LS) estimators can be extended with one or more terms that encourage smoothness in the estimated model. Smoother models tend to overfit the training data less and therefore generalize somewhat better.
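As an illustration of one of the nonlinear methods named above, the following is a minimal sketch of k-Nearest Neighbors regression. It is not the method adopted by the proposed system, and the function name and the toy data (points on the curve y = x²) are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a target by averaging the targets of the k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training target with its Euclidean distance to the query point.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical one-dimensional data following the nonlinear relation y = x**2.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]

estimate = knn_predict(train_x, train_y, (2.1,), k=3)  # averages the targets at x = 1, 2, 3
```

No functional form is assumed for f(x); the estimate depends only on nearby observations. When several predictors with different units are used, they would normally be rescaled first so that no single predictor dominates the distance.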

Multiple Linear Regression:

The difference between simple and multiple linear regression is that multiple linear regression has more than one independent variable, whereas simple linear regression has only one. In this project, the Multiple Linear Regression algorithm is used to predict the crops. Multiple regression is an extension of simple linear regression. It is used when we want to predict the value of a variable based on the values of two or more other variables. The variable we want to predict is called the dependent variable (or sometimes the outcome, target or criterion variable). The variables we use to predict the value of the dependent variable are called the independent variables (or sometimes the predictor, explanatory or regressor variables). For example, multiple regression can be used to understand whether exam performance can be predicted from revision time, test anxiety, lecture attendance and gender. Multiple regression also allows us to determine the overall fit (variance explained) of the model and the relative contribution of each predictor to the total variance explained.

Formulae:

A Linear Regression model that contains more than one predictor variable is called a Multiple Linear Regression model. The following is a Multiple Linear Regression model with two predictor variables, $x_1$ and $x_2$:

$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$ ……………. (2)

where $\beta_0, \beta_1, \beta_2, \ldots$ are the coefficients of the Multiple Linear Regression and $x_{i1}, x_{i2}, \ldots$ are the independent variables.

The model is linear because it is linear in the parameters $\beta_0$, $\beta_1$ and $\beta_2$. The model describes a plane in the three-dimensional space of $Y$, $x_1$ and $x_2$.

The parameter $\beta_0$ is the intercept of this plane. Parameters $\beta_1$ and $\beta_2$ are referred to as partial regression coefficients.

Parameter $\beta_1$ represents the change in the mean response corresponding to a unit change in $x_1$ when $x_2$ is held constant.

Parameter $\beta_2$ represents the change in the mean response corresponding to a unit change in $x_2$ when $x_1$ is held constant.
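The interpretation of the partial regression coefficients can be checked with a short derivation. Writing the two-predictor model as Y = β0 + β1·x1 + β2·x2 + ε and assuming the error term has zero mean (a standard assumption that this paper does not state explicitly), increasing x1 by one unit while holding x2 fixed changes the mean response by exactly β1:

```latex
\begin{align*}
E[Y \mid x_1 + 1,\, x_2] - E[Y \mid x_1,\, x_2]
  &= \bigl(\beta_0 + \beta_1 (x_1 + 1) + \beta_2 x_2\bigr)
   - \bigl(\beta_0 + \beta_1 x_1 + \beta_2 x_2\bigr) \\
  &= \beta_1 .
\end{align*}
```

The same argument with the roles of the two predictors exchanged gives β2 as the change in the mean response per unit change in x2 when x1 is held constant.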

Figure 2: System design

Input: Crop prediction depends on numerous factors such as soil nutrients, weather and past crop production. All these factors depend on location, so the location of the user is taken as the input to the system.

Data Acquisition: Depending on the current user location, the system retrieves the soil properties of the respective area from the soil repository. Similarly, weather parameters are extracted from the weather data set.

Data Processing: A crop can be cultivated only if suitable conditions are met. These include a wide range of parameters related to soil and weather. These constraints are compared with current conditions and the suitable crops are determined. The system uses Multiple Linear Regression to predict the crop. The prediction is based on past crop production data, that is, the weather and soil parameters actually recorded for past production are identified and compared with current conditions, which makes the prediction more accurate and practical.

Output: The system predicts the most profitable crop using the Multiple Linear Regression algorithm and provides the user with multiple crop suggestions according to crop duration.

Set Theory: The system is described by S = {I, FM, O, S, F} …………….(3), where the S inside the braces denotes the success condition defined below.

I = {I1}: set of inputs, where I1 = location of user.

FM = {Get Location, Get Attributes (latitude, longitude), Get Soil, Get Weather, Feasible Crop (soil, weather), Past Production, Profitable Crop (Feasible Crops, Past Production), Max Profitable Crops}: set of functions, where soil denotes the N, P, K components and weather denotes the temperature and rainfall values.

O = {Crop predicted for given location}: set of outputs.

S = Correct prediction for high production and profit: success condition.

F = Failure in prediction due to incorrect training data: failure condition.
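The flow from input location to recommended crops, using the functions listed in the set of functions, can be sketched as below. Everything in this sketch is hypothetical: the location name, the soil and weather values, the per-crop requirement ranges and the profit figures are invented stand-ins for the soil repository, the weather data set and the past-production records, and the profit lookup stands in for the regression-based prediction.

```python
# Hypothetical stand-ins for the soil repository and the weather data set.
SOIL_BY_LOCATION = {"village_a": {"N": 80, "P": 40, "K": 40}}
WEATHER_BY_LOCATION = {"village_a": {"temperature": 26.0, "rainfall": 900.0}}

# Hypothetical acceptable (min, max) range of each parameter for each crop.
NPK_RANGES = {"N": (60, 120), "P": (30, 60), "K": (30, 60)}
CROP_REQUIREMENTS = {
    "rice": {**NPK_RANGES, "temperature": (20, 35), "rainfall": (800, 2000)},
    "wheat": {**NPK_RANGES, "temperature": (10, 25), "rainfall": (300, 1000)},
    "millet": {**NPK_RANGES, "temperature": (20, 35), "rainfall": (300, 1000)},
}

# Hypothetical profit per hectare derived from past production records.
PAST_PROFIT = {"rice": 52000, "wheat": 40000, "millet": 30000}


def get_soil(location):
    """Get Soil: look up N, P, K values for the user's location."""
    return SOIL_BY_LOCATION[location]


def get_weather(location):
    """Get Weather: look up temperature and rainfall for the user's location."""
    return WEATHER_BY_LOCATION[location]


def feasible_crops(soil, weather):
    """Feasible Crop: keep crops whose requirement ranges contain every current value."""
    conditions = {**soil, **weather}
    return [
        crop
        for crop, ranges in CROP_REQUIREMENTS.items()
        if all(low <= conditions[name] <= high for name, (low, high) in ranges.items())
    ]


def max_profitable_crops(crops, past_profit, top_n=3):
    """Max Profitable Crops: rank feasible crops by past profit, highest first."""
    return sorted(crops, key=lambda crop: past_profit[crop], reverse=True)[:top_n]


def recommend(location, top_n=3):
    """Chain the steps: location -> soil and weather -> feasible crops -> ranked output."""
    candidates = feasible_crops(get_soil(location), get_weather(location))
    return max_profitable_crops(candidates, PAST_PROFIT, top_n)


suggestions = recommend("village_a")
```

With these invented values, wheat is excluded because the temperature falls outside its range, and the remaining crops are returned in order of past profit.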

V. Mathematical Representation of Algorithm for Proposed System

Train data:

$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i$, for $i = 1, 2, \ldots, n$

where $\beta_0, \beta_1, \beta_2, \ldots$ are the coefficients of the Multiple Linear Regression and $x_{i1}, x_{i2}, \ldots, x_{ip}$ are the independent variables.

X = {weather attributes, soil attributes}, Y = {production}

In matrix form: $Y = X\beta + E$, where $Y$ is the production matrix, $X$ is the attributes matrix, $\beta$ is the partial coefficient matrix and $E$ is the error term.

Least-squares estimate: $\hat{\beta} = (X'X)^{-1} X'Y$, where $X'$ is the transpose of $X$ and $(X'X)^{-1}$ is the inverse of the matrix $X'X$.

Prediction: $\hat{Y} = X\hat{\beta}$

Result: $\mathrm{res} = Y - \hat{Y}$
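The least-squares estimate, prediction and residual steps can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the system's actual code: the function names are hypothetical, the data were generated exactly from y = 1 + 2·x1 + 3·x2 so the answer is known in advance, and the normal equations (X'X)β = X'Y are solved by elimination rather than by forming the inverse explicitly, which gives the same estimate.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_mlr(features, targets):
    """Estimate beta in Y = X beta + E; a column of ones is added for the intercept."""
    x = [[1.0] + list(row) for row in features]
    xt = transpose(x)
    xtx = matmul(xt, x)                                   # X'X
    xty = [sum(v * t for v, t in zip(col, targets)) for col in xt]  # X'Y
    return solve(xtx, xty)


def predict(beta, row):
    """Prediction for one observation: beta_0 + sum of beta_j * x_j."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Hypothetical training data generated from y = 1 + 2*x1 + 3*x2 (no noise).
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1, 3, 4, 6, 8]

beta = fit_mlr(features, targets)
fitted = [predict(beta, row) for row in features]
residuals = [t - f for t, f in zip(targets, fitted)]  # res = Y - Y_hat
```

Because the invented data contain no noise, the estimated coefficients recover the generating values and every residual is zero up to rounding. With real production data the residuals would be non-zero and could be used to compare candidate models.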

 

VI. Conclusion

The proposed system takes into consideration data on soil, weather and past-year production and suggests the most profitable crops that can be cultivated under the prevailing environmental conditions. Because the system lists all possible crops, it helps the farmer decide which crop to cultivate. The system also considers past production data, which gives the farmer insight into the demand for and cost of various crops in the market. As the system covers a large number of crop types, the farmer may learn about crops that may never have been cultivated before.

In the future, all farming devices can be connected over the internet using IoT. Sensors deployed on the farm can collect information about current farm conditions, and devices can then adjust moisture, acidity, etc. accordingly. Farm vehicles such as tractors will be connected to the internet and will pass data to the farmer in real time about crop harvesting and about diseases the crops may be suffering from, helping the farmer take appropriate action. Further, the most profitable crop can also be determined in light of monetary conditions and the inflation ratio.

VII. Advantage

The results obtained will help farmers know the yield of a crop, so that they can choose a better crop that gives a high yield. The system also advises them on the efficient use of fertilizer, so that they apply only the amount required for that field. In this way we can help farmers grow the crops that give them a better yield.







© 2018. Tumbe International Journals . All Rights Reserved. Website Designed by ubiJournal