{"id":1113,"date":"2023-10-27T12:20:54","date_gmt":"2023-10-27T17:20:54","guid":{"rendered":"http:\/\/my.dev.vanderbilt.edu\/douglasfisher\/?page_id=1113"},"modified":"2023-10-27T12:23:51","modified_gmt":"2023-10-27T17:23:51","slug":"backpropagation","status":"publish","type":"page","link":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/backpropagation\/","title":{"rendered":"Backpropagation"},"content":{"rendered":"<p>Doug<br \/>\nPlease give me code for the backpropagation learning algorithm using a sigmoid activation. function, with one output unit, three input units, and one hidden layer of three hidden units. Give the code in C<\/p>\n<p>ChatGPT<br \/>\nCertainly, here&#8217;s a simplified implementation of the backpropagation learning algorithm with a sigmoid activation function for a neural network with one output unit, three input units, and one hidden layer with three hidden units in C. Please note that this is a basic example and doesn&#8217;t include all the optimizations and best practices you would find in a production-ready deep learning library.<\/p>\n<pre><code class=\"!whitespace-pre hljs language-c\"><span class=\"hljs-meta\">#<span class=\"hljs-keyword\">include<\/span> <span class=\"hljs-string\">&lt;stdio.h&gt;<\/span><\/span>\r\n<span class=\"hljs-meta\">#<span class=\"hljs-keyword\">include<\/span> <span class=\"hljs-string\">&lt;math.h&gt;<\/span><\/span>\r\n\r\n<span class=\"hljs-comment\">\/\/ Sigmoid activation function<\/span>\r\n<span class=\"hljs-type\">double<\/span> <span class=\"hljs-title function_\">sigmoid<\/span><span class=\"hljs-params\">(<span class=\"hljs-type\">double<\/span> x)<\/span> {\r\n    <span class=\"hljs-keyword\">return<\/span> <span class=\"hljs-number\">1.0<\/span> \/ (<span class=\"hljs-number\">1.0<\/span> + <span class=\"hljs-built_in\">exp<\/span>(-x));\r\n}\r\n\r\n<span class=\"hljs-comment\">\/\/ Derivative of the sigmoid function<\/span>\r\n<span class=\"hljs-type\">double<\/span> <span 
class=\"hljs-title function_\">sigmoid_derivative<\/span><span class=\"hljs-params\">(<span class=\"hljs-type\">double<\/span> x)<\/span> {\r\n    <span class=\"hljs-keyword\">return<\/span> x * (<span class=\"hljs-number\">1.0<\/span> - x);\r\n}\r\n\r\n<span class=\"hljs-type\">int<\/span> <span class=\"hljs-title function_\">main<\/span><span class=\"hljs-params\">()<\/span> {\r\n    <span class=\"hljs-comment\">\/\/ Define network parameters<\/span>\r\n    <span class=\"hljs-type\">double<\/span> input[<span class=\"hljs-number\">3<\/span>] = {<span class=\"hljs-number\">0.1<\/span>, <span class=\"hljs-number\">0.2<\/span>, <span class=\"hljs-number\">0.3<\/span>};\r\n    <span class=\"hljs-type\">double<\/span> target_output = <span class=\"hljs-number\">0.9<\/span>;\r\n    <span class=\"hljs-type\">double<\/span> learning_rate = <span class=\"hljs-number\">0.1<\/span>;\r\n    \r\n    <span class=\"hljs-comment\">\/\/ Initialize weights and biases<\/span>\r\n    <span class=\"hljs-type\">double<\/span> weights_input_hidden[<span class=\"hljs-number\">3<\/span>][<span class=\"hljs-number\">3<\/span>] = {  <span class=\"hljs-comment\">\/\/ Weights from input to hidden layer<\/span>\r\n        {<span class=\"hljs-number\">0.1<\/span>, <span class=\"hljs-number\">0.4<\/span>, <span class=\"hljs-number\">0.7<\/span>},\r\n        {<span class=\"hljs-number\">0.2<\/span>, <span class=\"hljs-number\">0.5<\/span>, <span class=\"hljs-number\">0.8<\/span>},\r\n        {<span class=\"hljs-number\">0.3<\/span>, <span class=\"hljs-number\">0.6<\/span>, <span class=\"hljs-number\">0.9<\/span>}\r\n    };\r\n    <span class=\"hljs-type\">double<\/span> weights_hidden_output[<span class=\"hljs-number\">3<\/span>] = {<span class=\"hljs-number\">0.3<\/span>, <span class=\"hljs-number\">0.5<\/span>, <span class=\"hljs-number\">0.7<\/span>};  <span class=\"hljs-comment\">\/\/ Weights from hidden to output<\/span>\r\n    <span class=\"hljs-type\">double<\/span> bias_hidden[<span class=\"hljs-number\">3<\/span>] = {<span class=\"hljs-number\">0.1<\/span>, <span class=\"hljs-number\">0.1<\/span>, <span class=\"hljs-number\">0.1<\/span>};  <span class=\"hljs-comment\">\/\/ Biases for hidden layer<\/span>\r\n    <span class=\"hljs-type\">double<\/span> bias_output = <span class=\"hljs-number\">0.1<\/span>;  <span class=\"hljs-comment\">\/\/ Bias for output layer<\/span>\r\n    \r\n    <span class=\"hljs-comment\">\/\/ The starting values above are arbitrary placeholders; random initialization is more common<\/span>\r\n    \r\n    <span class=\"hljs-comment\">\/\/ Training loop<\/span>\r\n    <span class=\"hljs-type\">int<\/span> epochs = <span 
class=\"hljs-number\">1000<\/span>;\r\n    \r\n    <span class=\"hljs-keyword\">for<\/span> (<span class=\"hljs-type\">int<\/span> epoch = <span class=\"hljs-number\">0<\/span>; epoch &lt; epochs; epoch++) {\r\n        <span class=\"hljs-comment\">\/\/ Forward pass<\/span>\r\n        <span class=\"hljs-type\">double<\/span> hidden[<span class=\"hljs-number\">3<\/span>];  <span class=\"hljs-comment\">\/\/ Hidden layer activations<\/span>\r\n        <span class=\"hljs-type\">double<\/span> output;     <span class=\"hljs-comment\">\/\/ Output activation<\/span>\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Compute activations for the hidden layer<\/span>\r\n        <span class=\"hljs-keyword\">for<\/span> (<span class=\"hljs-type\">int<\/span> i = <span class=\"hljs-number\">0<\/span>; i &lt; <span class=\"hljs-number\">3<\/span>; i++) {\r\n            hidden[i] = sigmoid(\r\n                input[<span class=\"hljs-number\">0<\/span>] * weights_input_hidden[<span class=\"hljs-number\">0<\/span>][i] +\r\n                input[<span class=\"hljs-number\">1<\/span>] * weights_input_hidden[<span class=\"hljs-number\">1<\/span>][i] +\r\n                input[<span class=\"hljs-number\">2<\/span>] * weights_input_hidden[<span class=\"hljs-number\">2<\/span>][i] + bias_hidden[i]\r\n            );\r\n        }\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Compute the output<\/span>\r\n        output = sigmoid(\r\n            hidden[<span class=\"hljs-number\">0<\/span>] * weights_hidden_output[<span class=\"hljs-number\">0<\/span>] +\r\n            hidden[<span class=\"hljs-number\">1<\/span>] * weights_hidden_output[<span class=\"hljs-number\">1<\/span>] +\r\n            hidden[<span class=\"hljs-number\">2<\/span>] * weights_hidden_output[<span class=\"hljs-number\">2<\/span>] + bias_output\r\n        );\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Calculate the error<\/span>\r\n        <span class=\"hljs-type\">double<\/span> error = 
<span class=\"hljs-number\">0.5<\/span> * (target_output - output) * (target_output - output);\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Backpropagation<\/span>\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Compute the delta for the output layer<\/span>\r\n        <span class=\"hljs-type\">double<\/span> delta_output = (target_output - output) * sigmoid_derivative(output);\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Compute the deltas for the hidden layer (before the output weights are changed)<\/span>\r\n        <span class=\"hljs-type\">double<\/span> delta_hidden[<span class=\"hljs-number\">3<\/span>];\r\n        <span class=\"hljs-keyword\">for<\/span> (<span class=\"hljs-type\">int<\/span> i = <span class=\"hljs-number\">0<\/span>; i &lt; <span class=\"hljs-number\">3<\/span>; i++) {\r\n            delta_hidden[i] = delta_output * weights_hidden_output[i] * sigmoid_derivative(hidden[i]);\r\n        }\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Update weights and biases for the output layer<\/span>\r\n        <span class=\"hljs-keyword\">for<\/span> (<span class=\"hljs-type\">int<\/span> i = <span class=\"hljs-number\">0<\/span>; i &lt; <span class=\"hljs-number\">3<\/span>; i++) {\r\n            weights_hidden_output[i] += learning_rate * delta_output * hidden[i];\r\n        }\r\n        bias_output += learning_rate * delta_output;\r\n        \r\n        <span class=\"hljs-comment\">\/\/ Update weights and biases for the hidden layer<\/span>\r\n        <span class=\"hljs-keyword\">for<\/span> (<span class=\"hljs-type\">int<\/span> i = <span class=\"hljs-number\">0<\/span>; i &lt; <span class=\"hljs-number\">3<\/span>; i++) {\r\n            <span class=\"hljs-keyword\">for<\/span> (<span class=\"hljs-type\">int<\/span> j = <span class=\"hljs-number\">0<\/span>; j &lt; <span class=\"hljs-number\">3<\/span>; j++) {\r\n                weights_input_hidden[j][i] += learning_rate * delta_hidden[i] * input[j];\r\n            }\r\n            
bias_hidden[i] += learning_rate * delta_hidden[i];\r\n        }\r\n    }\r\n    \r\n    <span class=\"hljs-comment\">\/\/ Training is complete<\/span>\r\n    \r\n    <span class=\"hljs-comment\">\/\/ You can now use the trained network for making predictions<\/span>\r\n    \r\n    <span class=\"hljs-keyword\">return<\/span> <span class=\"hljs-number\">0<\/span>;\r\n}\r\n<\/code><\/pre>\n<p>This is a basic example to illustrate the backpropagation algorithm with sigmoid activation. In practice, you&#8217;d use libraries or frameworks that are optimized for neural network training, as they provide various features for scalability and performance.<\/p>\n<p>Doug<br \/>\nWhy did you initialize the inputs 0.1, 0.2, and 0.3 respectively?<\/p>\n<p>ChatGPT<br \/>\nI apologize for the confusion in the initial code. Initializing the input values as 0.1, 0.2, and 0.3 was arbitrary and not based on any specific real-world data or task. In practice, the input values should be determined by the specific problem you are trying to solve with your neural network.<\/p>\n<p>Typically, in a real-world scenario, you would have a dataset with input features that represent the characteristics of your data. The initialization of input values would depend on the data you are working with. These input values should be extracted from your dataset or provided based on the problem you are trying to solve.<\/p>\n<p>So, please consider the values 0.1, 0.2, and 0.3 for the input units as a placeholder. In practice, you should replace these values with actual data that corresponds to your problem domain.<\/p>\n<p>Doug<br \/>\nThank you<\/p>\n<p>ChatGPT<br \/>\nYou&#8217;re welcome! If you have any more questions or need further assistance, please don&#8217;t hesitate to ask. Good luck with your programming and learning!<\/p>\n<p>Doug<br \/>\nWhy is the output error calculated as 0.5 times the square of the difference between the target and the actual output activation? 
Does this have to do with the loss function you chose or is it arbitrary?<\/p>\n<p>ChatGPT<br \/>\nThe choice of calculating the output error as 0.5\u00d7(target\u2212output)<sup>2<\/sup> is not arbitrary; it is related to the loss function commonly used in regression problems, specifically the mean squared error (MSE) loss function.<\/p>\n<p>The MSE loss function is a widely used choice when the goal is to minimize the average squared difference between the predicted output and the target values. It is particularly suitable for regression tasks, where you aim to predict continuous numeric values. The factor of 0.5 is included for mathematical convenience: when you take the derivative with respect to the output during backpropagation, it cancels the 2 that comes from the exponent.<\/p>\n<p>In the context of backpropagation and gradient descent, you are looking for the minimum of this loss function, and the derivative with respect to the output is used to guide weight updates. By using 0.5 as a scaling factor, the derivative simplifies to (output\u2212target), which is more convenient for updating weights.<\/p>\n<p>The choice of loss function depends on the nature of your problem. For classification tasks, you might use different loss functions, such as cross-entropy, which are more suitable for measuring the difference between predicted class probabilities and actual class labels. So, the selection of the loss function is not arbitrary and depends on the problem you are trying to solve.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Doug Please give me code for the backpropagation learning algorithm using a sigmoid activation. function, with one output unit, three input units, and one hidden layer of three hidden units. 
Give the code in C ChatGPT Certainly, here&#8217;s a simplified &hellip; <a href=\"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/backpropagation\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":633,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1113","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/pages\/1113","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/users\/633"}],"replies":[{"embeddable":true,"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/comments?post=1113"}],"version-history":[{"count":4,"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/pages\/1113\/revisions"}],"predecessor-version":[{"id":1117,"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/pages\/1113\/revisions\/1117"}],"wp:attachment":[{"href":"https:\/\/my.dev.vanderbilt.edu\/douglasfisher\/wp-json\/wp\/v2\/media?parent=1113"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}